I'll explain this with a MySQL query.
select * from stock where home>away
home away
1 3
2 1
1 0
4 5
I want to run the same query in MongoDB, but I couldn't get it to work.
array('column' => array('$gt' => what should I write here?))
I need help with the PHP usage, please.
You can not do this directly with a MongoDB query. Queries in MongoDB only allow comparisons with static values. However, there are a few options.
First of all, you can just store the result of the comparison whenever you update the values in each field. For example, you can store it as:
home away gt
1 3 0
2 1 1
1 0 1
4 5 0
This is the simplest solution, and it has the additional benefit that you can set an index on gt. Of course, it does mean more overhead when updating values. Doing this sort of pre-calculation is very similar to denormalisation, which is something you will often have to do in NoSQL databases in order to make the most of the system.
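A minimal sketch of the pre-calculation idea, in plain JavaScript (the language of the mongo shell) and run in memory rather than against a database. The helper name withGtFlag is hypothetical; it simply recomputes gt from home and away each time a document is about to be written, so the stored flag cannot drift from the two values.

```javascript
// Hypothetical helper: recompute the stored comparison flag before every write.
// Field names (home, away, gt) follow the example table above.
function withGtFlag(doc) {
  return { ...doc, gt: doc.home > doc.away ? 1 : 0 };
}

const rows = [
  { home: 1, away: 3 },
  { home: 2, away: 1 },
  { home: 1, away: 0 },
  { home: 4, away: 5 },
];

// Documents as they would be stored, each carrying its precomputed flag.
const stored = rows.map(withGtFlag);

// The query now compares a field against a static value, e.g. { gt: 1 }.
const query = { gt: 1 };
const matches = stored.filter((doc) => doc.gt === query.gt);
console.log(matches);
```

The point of the design is that the comparison cost is paid once per write, and the read becomes an ordinary equality match that an index on gt can serve.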
There is an alternative, but it wouldn't allow you to do an indexed search on >. You can use the aggregation framework in the following way:
db.so.aggregate( [
{ $project: {
'away' : 1,
'home': 1,
'better_at_home': { $cmp: [ '$home', '$away' ] }
} },
{ $match: { 'better_at_home': { $gt: 0 } } }
] );
In the first step ($project), we use $cmp to compare home and away; it returns 1, 0, or -1. In the second step ($match), we then filter out all the documents where that result is less than or equal to 0.
The result of the aggregation is:
{
"result" : [
{
"_id" : ObjectId("51ee7cfb812db9ff4412f12f"),
"home" : 2,
"away" : 1,
"better_at_home" : 1
},
{
"_id" : ObjectId("51ee7cff812db9ff4412f130"),
"home" : 1,
"away" : 0,
"better_at_home" : 1
}
],
"ok" : 1
}
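To make the two stages concrete, here is an in-memory imitation in plain JavaScript. It is only a sketch of the semantics, not the aggregation framework itself; the cmp function is a hypothetical stand-in for $cmp, and the sample rows are the four from the question.

```javascript
// Stand-in for $cmp: returns -1, 0 or 1 depending on how a compares to b.
function cmp(a, b) {
  if (a < b) return -1;
  if (a > b) return 1;
  return 0;
}

const docs = [
  { home: 1, away: 3 },
  { home: 2, away: 1 },
  { home: 1, away: 0 },
  { home: 4, away: 5 },
];

// $project stage: keep home and away, add the computed comparison field.
const projected = docs.map((d) => ({
  home: d.home,
  away: d.away,
  better_at_home: cmp(d.home, d.away),
}));

// $match stage: keep only documents whose comparison result is > 0.
const result = projected.filter((d) => d.better_at_home > 0);
console.log(result);
```

Because better_at_home only exists after the $project stage, the $match on it cannot be served by an index, which is the limitation mentioned above.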
Sadly, $gt cannot compare two fields.
What you can do for non-time-critical queries is to use $where:
> db.test.insert({home:5, away:3})
> db.test.insert({home:1, away:3})
> db.test.find({$where: 'this.home > this.away'})
{ "_id" : ObjectId("51ec576418fd21f745899945"), "home" : 5, "away" : 3 }
Your performance will be better, though, if you just store an additional "diff" field in the document and search on that using $gt: 0.
> db.test.insert({home:5, away:3, diff:2})
> db.test.insert({home:1, away:3, diff:-2})
> db.test.find({diff: {$gt:0}}, {_id:1,home:1,away:1})
{ "_id" : ObjectId("51ee982a6e4b3b34421de7bc"), "home" : 5, "away" : 3 }
You cannot use normal querying for this. However, never fear: the aggregation framework can help with the $cmp operator: http://docs.mongodb.org/manual/reference/aggregation/cmp/
db.collection.aggregate({$project:{c: {$cmp:['$col1','$col2']}}},{$match:{c:{$gt:0}}})
That being said, it isn't very index-friendly and is currently limited to 16 MB of results.
edit
To take your edit into account:
$mongo->collection->aggregate(array(
array('$project'=>
array(
'c'=>array('$cmp'=>array('$col1','$col2'))
)
),
array('$match'=>array('c'=>array('$gt'=>0)))
));
I need help with MongoDB and the PHP driver.
I have 4 collections:
order_aproved :
{
"order_id" : mongoId ,
"user_id":num ,
"order_date":mongoDate ,
"requset" : string
}
orders_rejected :
{
"order_id" : mongoId,
"user_id" : num ,
"order_date" : mongoDate ,
"requset" : string
}
users :
{
"user_id" : mongoId,
"username" : num ,
"last_order" : mongoDate ,
"num_orders" : num,
"last_order"
}
orders_log :
{
"order_id" : mongoId ,
"order_date" : mongoDate ,
"status" : boolen ,
"user_id" : num
}
For every approved/rejected order, I update num_orders on the document of the user
who placed it, so that number is always changing,
and I log the order in orders_log.
I need to fetch all approved/rejected orders from orders_log for a list of users [array], subject to a condition, and for each order also get that user's order count (num_orders) and last order date.
I am doing it like this:
$cursor = $orders->find()->sort(array("order_date" => -1))->limit(15);
$array = iterator_to_array($cursor,false);
$users_for_aproved = ["123","124","125"];
$users_for_rejcted = ["112","113","114"];
$js = "function() { if ( this.requset ) { return this.requset.length > 0 } }";
$query = array( '$and' => array(
array("user_id" => array('$in'=> $users_for_aproved)),
array('$where' => $js )
));
$query1 = array( '$and' => array(
array("user_id" => array('$in'=> $users_for_rejcted)),
array('$where' => $js )
));
$query_or = array('$or' => array($query,$query1));
$cursor = $orders_log->find($query_or)->sort(array("order_date" => -1))->limit(15);
$array = iterator_to_array($cursor,false);
for ( $x=0; $x < count($array) ; $x++ ) {
// Look up the user of the current log entry
$query = array( "user_id" => $array[$x]["user_id"] );
$cursor = $orders->find($query)->limit(1);
// Use a separate variable so the result list in $array is not overwritten
$found = iterator_to_array($cursor,false);
$array[$x]["order_count"] = $found[0]["num_orders"];
}
return $array;
It works, but it's not very efficient. I need a way to fetch data from another collection and add num_orders to the documents I have found without running another query,
like an SQL JOIN, but with MongoDB and the PHP driver.
Thanks!
There are two ways to gain performance in this case:
1. Create an index in MongoDB:
db.order_aproved.createIndex( { user_id: 1 } )
You may create the index either as above, or in the background:
db.order_aproved.createIndex( { user_id: 1 }, { background: true } )
In the latter case, index creation will be slower, but it will not block operations currently running on the database. If you can afford it, I think you had better create the index in the foreground, especially if you are not running this script on the production database.
2. Redesign the collections: instead of separate collections joined by some ID, embed the related documents inside the main document, which eliminates the need for any JOIN-like operations as in an RDBMS.
Of the two, the first seems to me the simplest and most straightforward solution in your case. Choosing it, you will also avoid the performance cost of updating embedded documents.
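For comparison, the second option could look roughly like the sketch below, written as plain JavaScript objects. Everything here is an assumption for illustration: the helper name buildLogEntry, the embedded "user" sub-document, and the sample values are not from the question, only the field names order_id, order_date, status, user_id, num_orders and last_order are.

```javascript
// Hypothetical shape for an orders_log entry that embeds the user fields
// the report needs, so reading the log requires no second lookup.
function buildLogEntry(order, user, approved) {
  return {
    order_id: order.order_id,
    order_date: order.order_date,
    status: approved,
    user: {
      user_id: user.user_id,
      num_orders: user.num_orders,
      last_order: user.last_order,
    },
  };
}

// Sample values, invented for the example.
const user = { user_id: 123, num_orders: 7, last_order: new Date("2016-01-02") };
const order = { order_id: "o-1", order_date: new Date("2016-01-03") };

const entry = buildLogEntry(order, user, true);
console.log(entry.user.num_orders);
```

The embedded copy is a snapshot taken at write time, so every change to num_orders would also have to be written into the affected log entries. That is the update cost that makes the index the cheaper choice here.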
We have configured ElasticSearch to create a separate index for each date.
health status index pri rep docs.count docs.deleted store.size pri.store.size
yellow open localbeta-logstash-2016.07.20 5 1 85636 0 27mb 27mb
yellow open .kibana 1 1 2 0 9.6kb 9.6kb
yellow open localbeta-logstash-2016.07.21 5 1 108346 0 37.7mb 37.7mb
yellow open localbeta-logstash-2016.07.22 5 1 58172 0 22.1mb 22.1mb
yellow open localbeta-logstash-2016.07.19 5 1 11535 0 3.6mb 3.6mb
Now we need to run a query that fetches the logs matching a particular field across all the indexes.
From PHP-ElasticSearch, I understand that it is easy to query a particular index.
But how do I query all the indexes at once?
You have to query for 'index' => '*'
(* is a wildcard).
You can use _all or empty string to refer to all indices: https://github.com/elastic/elasticsearch-php/blob/master/src/Elasticsearch/Client.php#L814
$params['index'] = (list) A comma-separated list of index names to search; use _all or empty string to perform the operation on all indices
Searching across different indexes works the same way as searching in a single index. You do not need to change anything in your query. You can use _all or, as Vuldo mentioned, '*' (a wildcard) as the value for your index, or just leave it empty, and it will search all types in all indices.
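As a rough illustration, the search parameters differ only in the index value. The sketch below builds them as plain JavaScript objects that mirror the shape of the PHP params array; no client library is called. The helper name buildSearchParams, the field name "level" and the value "error" are placeholders, and the pattern localbeta-logstash-* is an assumption based on the index names listed in the question.

```javascript
// Plain-object mirror of the search parameters; nothing is sent anywhere.
// The field "level" and the value "error" are placeholders for the real field.
function buildSearchParams(index) {
  return {
    index: index,
    body: { query: { match: { level: "error" } } },
  };
}

// Every index, including .kibana:
const allIndices = buildSearchParams("_all");

// Only the daily logstash indices, via a wildcard pattern:
const byPattern = buildSearchParams("localbeta-logstash-*");

console.log(allIndices.index, byPattern.index);
```

A name pattern may be preferable to _all here, since it would skip unrelated indices such as .kibana.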
You can take a look here for more details
In Kibana, you would search all docs with request below:
GET YOUR_INDEX_NAME/_search
{
"query": {
"match_all": {}
}
}
The same goes for PHP: you create the same request body with different syntax, which would look like this:
[
"query" => [
"match_all" => new \stdClass()
]
]
I am using MongoDB, and in the past I have been able to use the following to insert into or append to a sub-array that was already in the DB.
Here is my issue: every day we look at the iTunes top 100 and insert the songs and artists into our collection. In fact, we use two collections for this job,
but the one I am having issues with is the one where we store every single song and artist that has ever appeared in the iTunes top 100.
See the code below:
$collection = static::db()->itunes_collection_australia;
$document = $collection->findOne(array('song' => (string)$entry->imname, "artist"=>(string)$entry->imartist));
if (null !== $document) {
$collection->update(
array(array('song' => (string)$entry->imname, "artist"=>(string)$entry->imartist)),
array('$push' => array("date" => array('$each'=> array((string)$date)),"chartno"=> array('$each'=> array($a))),
));
}
else
{
$collection->insert(array("song"=>(string)$entry->imname, "artist"=>(string)$entry->imartist,"preview"=>(string)$preview,"cd_cover"=>(string)$cover, "price"=>(string)$price,"date"=>array((string)$date),"chartno"=>array($a)));
}
What should happen is that if the artist and song are already in the collection, the document is updated. At the moment it does not seem to run anything,
and if it is updating, it is not doing it right.
The "date" field should show multiple dates, and likewise chartno should show the position the song held in the charts on each of those days.
Here is how it should look when first inserted:
{
"_id" : ObjectId("52ea794d6ed348572d000013"),
"song" : "Timber (feat. Ke$ha)",
"artist" : "Pitbull",
"preview" : "http://a1264.phobos.apple.com/us/r1000/030/Music6/v4/48/30/3c/48303ca0-c509-8c15-4d4a-7ebd65c74725/mzaf_5507852070192786345.plus.aac.p.m4a",
"cd_cover" : "http://a1082.phobos.apple.com/us/r30/Music6/v4/64/41/81/644181ba-d236-211d-809e-057f4352d3d8/886444273480.170x170-75.jpg",
"price" : "$2.19",
"date" : [
"2014-01-29T07:10:38-07:00"
],
"chartno" : [
20
]
}
When the script sees the song is back in the top 100, it should append to the date and chartno fields,
like so:
{
"_id" : ObjectId("52ea794d6ed348572d000013"),
"song" : "Timber (feat. Ke$ha)",
"artist" : "Pitbull",
"preview" : "http://a1264.phobos.apple.com/us/r1000/030/Music6/v4/48/30/3c/48303ca0-c509-8c15-4d4a-7ebd65c74725/mzaf_5507852070192786345.plus.aac.p.m4a",
"cd_cover" : "http://a1082.phobos.apple.com/us/r30/Music6/v4/64/41/81/644181ba-d236-211d-809e-057f4352d3d8/886444273480.170x170-75.jpg",
"price" : "$2.19",
"date" : [
"2014-01-30T07:10:38-07:00",
"2014-01-31T07:10:38-07:00"
],
"chartno" : [
20,
30
]
}
However, that is not happening; in fact, nothing seems to be getting added.
I am wondering if I have done something wrong? Well, clearly I have.
I have also tried '$addToSet', but with no success.
Your update statement is wrong: you have too many arrays in the first parameter. Try this:
$collection->update(
array('song' => (string)$entry->imname, "artist"=>(string)$entry->imartist),
array('$push' => array("date" => array('$each'=> array((string)$date)),"chartno"=> array('$each'=> array($a))),
));
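The difference is easier to see when the two arguments are built separately. Below is a small sketch in plain JavaScript (objects only, no driver call); the helper name buildChartUpdate is hypothetical, and the sample values come from the document shown in the question.

```javascript
// Hypothetical helper returning the two arguments for update(): a filter
// document (a plain object, not a list wrapping one) and the $push update.
function buildChartUpdate(song, artist, date, position) {
  const filter = { song: song, artist: artist };
  const update = {
    $push: {
      date: { $each: [date] },
      chartno: { $each: [position] },
    },
  };
  return { filter, update };
}

const { filter, update } = buildChartUpdate(
  "Timber (feat. Ke$ha)",
  "Pitbull",
  "2014-01-30T07:10:38-07:00",
  30
);

console.log(Array.isArray(filter)); // false
console.log(JSON.stringify(update));
```

In the original code the filter was wrapped in one extra array, so the criteria were nested one level too deep and could not match the stored song and artist fields.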
I have the following data structure in MongoDB:
{ "_id" : ObjectId( "xy" ),
"litter" : [
{ "puppy_name" : "Tom",
"birth_timestamp" : 1353963728 },
{ "puppy_name" : "Ann",
"birth_timestamp" : 1353963997 }
]
}
I have many of these "litter" documents with varying numbers of puppies. The higher the timestamp, the younger the puppy (i.e. born later).
What I would like to do is retrieve the five youngest puppies from the collection across all litter documents.
I tried something along the lines of
find().sort({'litter.birth_timestamp' : -1}).limit(5)
to get the five litters which have the youngest puppies, and then to extract the youngest puppy from each litter in the PHP script.
But I am not sure if this will work properly. Any idea how to do this right (without changing the data structure)?
You can use the new Aggregation Framework in MongoDB 2.2 to achieve this:
<?php
$m = new Mongo();
$collection = $m->selectDB("test")->selectCollection("puppies");
$pipeline = array(
// Create a document stream (one per puppy)
array('$unwind' => '$litter'),
// Sort by birthdate descending
array('$sort' => array (
'litter.birth_timestamp' => -1
)),
// Limit to 5 results
array('$limit' => 5)
);
$results = $collection->aggregate($pipeline);
var_dump($results);
?>
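To show what the three stages do to the data, here is an in-memory imitation in plain JavaScript. It is a sketch of the semantics only, not the aggregation framework; the function name youngestPuppies is hypothetical, and the second litter ("Rex") is invented sample data.

```javascript
const litters = [
  {
    _id: "a",
    litter: [
      { puppy_name: "Tom", birth_timestamp: 1353963728 },
      { puppy_name: "Ann", birth_timestamp: 1353963997 },
    ],
  },
  {
    _id: "b",
    litter: [{ puppy_name: "Rex", birth_timestamp: 1353964100 }],
  },
];

function youngestPuppies(docs, n) {
  return docs
    // $unwind: one output document per puppy, keeping the parent _id
    .flatMap((doc) => doc.litter.map((puppy) => ({ _id: doc._id, litter: puppy })))
    // $sort: highest timestamp (youngest puppy) first
    .sort((x, y) => y.litter.birth_timestamp - x.litter.birth_timestamp)
    // $limit: keep the first n
    .slice(0, n);
}

console.log(youngestPuppies(litters, 2).map((p) => p.litter.puppy_name));
```

Because the sort happens after the unwind, the five results can all come from the same litter, which a find() sorted per document could not deliver.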
I want to find documents where last elements in an array equals to some value.
Array elements may be accessed by specific array position:
// i.e. comments[0].by == "Abe"
db.example.find( { "comments.0.by" : "Abe" } )
But how do I search using the last item as the criterion?
i.e.
db.example.find( { "comments.last.by" : "Abe" } )
By the way, I'm using PHP.
I know this question is old, but I found it on Google after answering a similar new question. So I thought this deserved the same treatment.
You can avoid the performance hit of $where by using aggregate instead:
db.example.aggregate([
// Use an index, which $where cannot, to narrow down the documents
{$match: { "comments.by": "Abe" }},
// De-normalize the array
{$unwind: "$comments"},
// The order of the array is maintained, so just take the $last per _id
{$group: { _id: "$_id", comments: {$last: "$comments"} }},
// Match only where that $last comment is by "Abe"
{$match: { "comments.by": "Abe" }},
// Retain the original _id order
{$sort: { _id: 1 }}
])
And that should run rings around $where since we were able to narrow down the documents that had a comment by "Abe" in the first place. As warned, $where is going to test every document in the collection and never use an index even if one is there to be used.
Of course, you can also maintain the original document using the technique described here as well, so everything would work just like a find().
Just food for thought for anyone finding this.
Update for Modern MongoDB releases
Modern releases have added the $redact pipeline expression as well as $arrayElemAt ( the latter as of 3.2, so that would be the minimal version here ) which in combination would allow a logical expression to inspect the last element of an array without processing an $unwind stage:
db.example.aggregate([
{ "$match": { "comments.by": "Abe" }},
{ "$redact": {
"$cond": {
"if": {
"$eq": [
{ "$arrayElemAt": [ "$comments.by", -1 ] },
"Abe"
]
},
"then": "$$KEEP",
"else": "$$PRUNE"
}
}}
])
The logic here is a comparison: the path "$comments.by" yields an array of just the values of the "by" property (the equivalent of a $map), and $arrayElemAt with index -1 picks the last element of that array. This allows comparison of that single value against the required parameter, "Abe".
Or even a bit more modern using $expr for MongoDB 3.6 and greater:
db.example.find({
"comments.by": "Abe",
"$expr": {
"$eq": [
{ "$arrayElemAt": [ "$comments.by", -1 ] },
"Abe"
]
}
})
This would be by far the most performant solution for matching the last element within an array, and is actually expected to supersede the usage of $where in most cases, especially here.
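The expression can be imitated in memory to see what it evaluates. The sketch below is plain JavaScript, not a query; the function name lastCommentIsBy is hypothetical and the "Bob" comments are invented sample data.

```javascript
// Imitates { $eq: [ { $arrayElemAt: [ "$comments.by", -1 ] }, name ] }.
function lastCommentIsBy(doc, name) {
  // "$comments.by" resolves to the array of "by" values
  const byValues = (doc.comments || []).map((c) => c.by);
  // index -1 means the last element; an empty array never matches
  return byValues.length > 0 && byValues[byValues.length - 1] === name;
}

const docs = [
  { _id: 1, comments: [{ by: "Abe" }, { by: "Bob" }] },
  { _id: 2, comments: [{ by: "Bob" }, { by: "Abe" }] },
  { _id: 3, comments: [] },
];

const matched = docs.filter((d) => lastCommentIsBy(d, "Abe")).map((d) => d._id);
console.log(matched);
```

Document 1 contains a comment by Abe but not in last position, which is why the plain "comments.by" condition alone is not enough and only serves to narrow the candidates first.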
You can't do this in one go with this schema design. You can either store the length and do two queries, or store the last comment additionally in another field:
{
'_id': 'foo',
'comments': [
{ 'value': 'comment #1', 'by': 'Ford' },
{ 'value': 'comment #2', 'by': 'Arthur' },
{ 'value': 'comment #3', 'by': 'Zaphod' }
],
'last_comment': {
'value': 'comment #3', 'by': 'Zaphod'
}
}
Sure, you'll be duplicating some data, but at least you can set this data with $set together with the $push for the comment.
$comment = array(
'value' => 'comment #3',
'by' => 'Zaphod',
);
$collection->update(
array( '_id' => 'foo' ),
array(
'$set' => array( 'last_comment' => $comment ),
'$push' => array( 'comments' => $comment )
)
);
Finding the last one is easy now!
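A small in-memory sketch of why the lookup becomes trivial, in plain JavaScript. The helper addComment is hypothetical and only imitates the combined $set and $push; the query object at the end is what would be passed to find(), assuming the last_comment field shown above.

```javascript
// Imitates the combined update: append to comments and overwrite last_comment.
function addComment(doc, comment) {
  return {
    ...doc,
    comments: [...doc.comments, comment],
    last_comment: comment,
  };
}

let doc = { _id: "foo", comments: [] };
doc = addComment(doc, { value: "comment #1", by: "Ford" });
doc = addComment(doc, { value: "comment #2", by: "Arthur" });
doc = addComment(doc, { value: "comment #3", by: "Zaphod" });

// The query on the duplicated field is a plain equality match.
const query = { "last_comment.by": "Zaphod" };
const isMatch = doc.last_comment.by === query["last_comment.by"];
console.log(isMatch);
```

Since last_comment.by is an ordinary field, it can also be indexed, unlike a position-dependent check on the array.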
You could do this with a $where operator:
db.example.find({ $where:
'this.comments.length && this.comments[this.comments.length-1].by === "Abe"'
})
The usual slow performance caveats for $where apply. However, you can help with this by including "comments.by": "Abe" in your query:
db.example.find({
"comments.by": "Abe",
$where: 'this.comments.length && this.comments[this.comments.length-1].by === "Abe"'
})
This way, the $where only needs to be evaluated against documents that include comments by Abe and the new term would be able to use an index on "comments.by".
I'm just doing :
db.products.find({'statusHistory.status':'AVAILABLE'},{'statusHistory': {$slice: -1}})
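This returns products that have an item with status='AVAILABLE' somewhere in statusHistory, and the $slice: -1 projection returns only the last statusHistory item for each of them. Note that the query condition itself is not restricted to the last item.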
I am not sure why my answer above is deleted. I am reposting it. I am pretty sure without changing the schema, you should be able to do it this way.
db.example.find({ "comments:{$slice:-1}.by" : "Abe" })
// ... or
db.example.find({ "comments.by" : "Abe" })
This by default takes the last element in the array.