I have lots of records in a collection. They are indexed by a binary UUID:
db.users.find().limit(1);
[{ "_id": ..., "guid": BinData(2,"EAAAANR56IodpE3xhYLtfugc7SY="), otherdata }]
If I query from the CLI, I can retrieve the records:
db.users.find({ "guid": BinData(2,"EAAAANR56IodpE3xhYLtfugc7SY=") });
[{ "_id": ..., "guid": BinData(2,"EAAAANR56IodpE3xhYLtfugc7SY="), otherdata }]
If I want to do the same thing from PHP, the query returns nothing:
$client->db->setProfilingLevel(2);
$res = $client->db->users->find(array('guid' => new MongoBinData($bin_data, 2)));
$res->next(); // perform the query
echo $res->count(); // display 0
Of course, I have checked that the $bin_data variable holds the right value.
If I look at mongo's logs,
Thu Jan 10 18:05:06 [conn1] query db.users query: { guid: BinData } ntoreturn:0 ntoskip:0 nscanned:0 keyUpdates:0 locks(micros) r:14807 nreturned:0 reslen:20 14ms
the nscanned value is 0! Does this mean it does not even scan the collection for results?
Any clue?
Edit: I have set the profiling level to 2, but it does not change anything. I can see authentication in the logs but still no query.
Edit2 Added log line.
I must admit I am confused myself. I ran:
$mongo = new Mongo();
$db = $mongo->mydb;
$col = $db->gjgjgj->insert(
array( "guid" => new MongoBinData("EAAAANR56IodpE3xhYLtfugc7SY=", 2) )
);
var_dump($col);
$cur = $db->gjgjgj->find(
array( "guid" => new MongoBinData("EAAAANR56IodpE3xhYLtfugc7SY=", 2) )
);
var_dump(iterator_to_array($cur));
And my output was:
boolean true
array
'50ef0bc96803fa0b04000000' =>
array
'_id' =>
object(MongoId)[7]
public '$id' => string '50ef0bc96803fa0b04000000' (length=24)
'guid' =>
object(MongoBinData)[9]
public 'bin' => string 'EAAAANR56IodpE3xhYLtfugc7SY=' (length=28)
public 'type' => int 2
Can you tell us what PHP driver version this is?
Can you also post your entire script that can reproduce this?
Edit
Of course running count() after running a next() on the cursor works for me too.
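One thing worth checking in the snippets above: the shell prints BinData contents as base64 text, while the PHP constructor takes bytes, so passing the base64 string straight through stores (and searches for) the 28 characters of text rather than the binary value. The sketch below only decodes the value from the question and inspects it; it does not talk to the database. The variable names are mine, and whether your driver version adds the subtype-2 length prefix itself is an assumption you would need to verify.

```php
<?php
// Decode the base64 text that the shell prints inside BinData(2, "...").
$shellBase64 = 'EAAAANR56IodpE3xhYLtfugc7SY=';
$raw = base64_decode($shellBase64, true);

// Subtype 2 (old binary) carries a 4-byte little-endian length in front of
// the payload; here the decoded value starts with such a prefix.
$declaredLength = unpack('V', substr($raw, 0, 4))[1];
$payload = substr($raw, 4);

printf(
    "raw bytes: %d, declared length: %d, payload: %s\n",
    strlen($raw),
    $declaredLength,
    bin2hex($payload)
);

// Assumption to verify against your driver: if the driver writes the
// subtype-2 length prefix itself, pass $payload; otherwise pass $raw.
// $binary = new MongoBinData($payload, 2);
```

If neither $payload nor $raw matches, comparing bin2hex() of what you send with the decoded shell value should show where the two differ.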
Mongo doesn't log all queries by default. Only slow queries (usually taking more than 100 milliseconds) are logged. Also check out docs on profiling.
You can use MongoDB profiling from PHP to see what is happening there:
http://php.net/manual/en/mongodb.setprofilinglevel.php
$this->command(array('profile' => 2))
Related
In my MongoDB collection, all documents contain a mileage field which currently is a string. Using PHP, I'd like to add a second field which contains the same content, but as an integer value. Questions like How to change the type of a field? contain custom MongoDB code which I don't want to run using PHP, and questions like mongodb php Strings to float values retrieve all documents and loop over them.
Is there any way to use \MongoDB\Operation\UpdateMany for this, as this would put all the work to the database level? I've already tried this for static values (like: add the same string to all documents), but struggle with getting the data to be inserted from the collection itself.
Some further hints:
I'm looking for a pure PHP solution that does not rely on any binary to be called using exec. This should avoid installing more packages than needed on the PHP server
Currently, I have to use MongoDB in v4.0. Yes, that's not the most recent version, but I'm not in the position to perform an upgrade
Try this, please:
01) MongoDB Aggregate reference:
db.collectionName.aggregate(
[
{ "$addFields": {
"intField": { "$toInt": "$stringFieldName" }
}},
{ "$out": "collectionName" }
]
)
02) Possible PHP solution (Using as reference https://www.php.net/manual/en/mongocollection.aggregate.php):
$pipeline = array(
array(
'$addFields' => array(
'integerField' => array('$toInt' => '$mileage')
)
),
array(
'$out' => 'collection'
),
);
$updateResult = $collection->aggregate($pipeline);
You could use $set like this in 4.2, which supports aggregation pipelines in updates.
The $set stage creates a mileageasint field from the existing mileage value using $toInt.
db.collection.updateMany(
<query>,
[{ $set: { "mileageasint":{"$toInt":"$mileage" }}}],
...
)
PHP solution (using the example from here):
$updateResult = $collection->updateMany(
[],
[['$set' => [ 'mileageasint' => [ '$toInt' => '$mileage']]]]
);
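Since the question is pinned to MongoDB 4.0 and pipeline-style updates need 4.2, here is a rough sketch of how the two suggestions above could be selected by server version. The function name planMileageConversion and the returned array shape are hypothetical; only the stage and operator names ($addFields, $set, $toInt, $out) come from the answers. Keep in mind that $out replaces the whole target collection with the pipeline output.

```php
<?php
// Sketch: choose between the two approaches depending on server version.
// planMileageConversion() is a hypothetical helper, not a library function.
function planMileageConversion(string $serverVersion, string $collectionName): array
{
    // Same expression in both variants: copy mileage into an integer field.
    $toInt = ['mileageasint' => ['$toInt' => '$mileage']];

    if (version_compare($serverVersion, '4.2', '>=')) {
        // 4.2+: updateMany() accepts an aggregation pipeline as the update.
        return ['method' => 'updateMany', 'pipeline' => [['$set' => $toInt]]];
    }

    // 4.0: rewrite the collection through aggregate() with $addFields + $out.
    return [
        'method' => 'aggregate',
        'pipeline' => [['$addFields' => $toInt], ['$out' => $collectionName]],
    ];
}

$plan = planMileageConversion('4.0.28', 'cars');
echo $plan['method'], ': ', json_encode($plan['pipeline']), "\n";
```

With a MongoDB\Collection instance you would then pass $plan['pipeline'] to either $collection->aggregate() or $collection->updateMany([], ...). If some mileage strings are not numeric, $toInt will raise an error, so it may be worth testing on a copy first.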
I have a field called "arrivalDate" and this field is a string. Each document has an arrivalDate in string format (ex: 20110128). I want my output to be something like this (date and the number of records that have that date):
Date : how many records have that date
20110105 : 5 records
20120501 : 2 records
20120602 : 15 records
I already have the query to get these results.
I am trying to display aggregated results in PHP from Elasticsearch. I want my output to be something like this:
Date : how many records have that date
20110105 : 5 records
20120501 : 2 records
20120602 : 15 records
This is what I have so far:
$json = '{"aggs": { "group_by_date": { "terms": { "field": "arrivalDate" } } } }';
$params = [
'index' => 'pickups',
'type' => 'external',
'body' => $json
];
$results = $es->search($params);
However, I don't know how to display the results in PHP. For example, if I wanted to display the total number of documents, I would do echo $results['hits']['total']. How could I display all the dates with the number of records they have in PHP?
I'd suggest using aggregations in the same way you construct the query, from my experience it seems to work quicker. Please see the below code:
'aggs' => [
'group_by_date' => [
'terms' => [
'field' => 'arrivalDate',
'size' => 500
]
]
]
Following that, instead of reading the typical $results['hits']['hits'], you would read $results['aggregations']. The data sits in the buckets of the named aggregation; each bucket has a key (the date) and a doc_count (the number of records with that date).
For accessing the data from the aggregation shown above, it would likely be something along the lines of:
foreach ($results['aggregations']['group_by_date']['buckets'] as $bucket) {
    echo $bucket['key'] . ' : ' . $bucket['doc_count'] . " records\n";
}
There may be a neater way of accessing the nested array, but a loop like this works well for me. If you have any issues with accessing the data, let me know.
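To try the bucket access without a cluster, here is a small self-contained sketch. The $sample array is hand-written to mimic the shape of a terms aggregation response (using the dates from the question), not real output, and formatDateCounts is a hypothetical helper name.

```php
<?php
// Format the buckets of a terms aggregation as "key : N records" lines.
// $aggName is whatever name the aggregation was given in the request.
function formatDateCounts(array $results, string $aggName): array
{
    $lines = [];
    foreach ($results['aggregations'][$aggName]['buckets'] ?? [] as $bucket) {
        $lines[] = $bucket['key'] . ' : ' . $bucket['doc_count'] . ' records';
    }
    return $lines;
}

// Hand-written sample shaped like a search response, not real cluster output.
$sample = [
    'aggregations' => [
        'group_by_date' => [
            'buckets' => [
                ['key' => '20120602', 'doc_count' => 15],
                ['key' => '20110105', 'doc_count' => 5],
                ['key' => '20120501', 'doc_count' => 2],
            ],
        ],
    ],
];

echo implode("\n", formatDateCounts($sample, 'group_by_date')), "\n";
```

The ?? [] fallback just keeps the loop quiet when the aggregation is missing from the response.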
We are trying to search a DynamoDB table and need to get a count of objects within a grouping. How can this be done?
I have tried this, but when adding the second number, this doesn't work:
$search = array(
'TableName' => 'dev_adsite_rating',
'Select' => 'COUNT',
'KeyConditions' => array(
'ad_id' => array(
'ComparisonOperator' => 'EQ',
'AttributeValueList' => array(
array('N' => 1039722, 'N' => 1480)
)
)
)
);
$response = $client->query($search);
The sql version would look something like this:
select ad_id, count(*)
from dev_adsite_rating
where ad_id in(1039722, 1480)
group by ad_id;
So, is there a way for us to achieve this? I can not find anything on it.
Performing a query like this on DynamoDB is slightly trickier than in the SQL world. To do something like this, you'll need to consider a few things:
EQ-only hash key: The hash key condition only supports EQ, so you'll need to make two queries (i.e. ad_id EQ 1039722 / ad_id EQ 1480).
Paginate through the query: Because DynamoDB returns your result set in increments, you'll need to paginate through your results. Learn more here.
Running "Count": You can take the "Count" property from each response and add it to a running total as you paginate through the results of both queries (see the Query API).
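The three points above could be sketched roughly as follows. To keep it runnable without AWS credentials, the query call is passed in as a callable; $runQuery and countPerAdId are hypothetical names, and in real code the callable would wrap $client->query(). The request keys (Select, KeyConditions, ExclusiveStartKey) and response keys (Count, LastEvaluatedKey) are the ones used by the Query API.

```php
<?php
// Sketch: count items per hash key by issuing one Query per key and
// following LastEvaluatedKey until each result set is exhausted.
// $runQuery is a hypothetical callable standing in for $client->query().
function countPerAdId(callable $runQuery, string $table, array $adIds): array
{
    $counts = [];
    foreach ($adIds as $adId) {
        $counts[$adId] = 0;
        $startKey = null;
        do {
            $params = [
                'TableName' => $table,
                'Select' => 'COUNT',
                'KeyConditions' => [
                    'ad_id' => [
                        'ComparisonOperator' => 'EQ',
                        // One value per EQ condition; numbers are sent as strings.
                        'AttributeValueList' => [['N' => (string) $adId]],
                    ],
                ],
            ];
            if ($startKey !== null) {
                $params['ExclusiveStartKey'] = $startKey;
            }
            $response = $runQuery($params);
            // Add this page's count to the running total for the key.
            $counts[$adId] += $response['Count'];
            $startKey = $response['LastEvaluatedKey'] ?? null;
        } while ($startKey !== null);
    }
    return $counts;
}
```

Usage would look something like countPerAdId(function ($p) use ($client) { return $client->query($p); }, 'dev_adsite_rating', [1039722, 1480]), which gives one count per ad_id, much like the GROUP BY in the SQL version.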
You could add a Lambda function triggered by the DynamoDBStream, to aggregate your data on the fly, in your case add +1 to the relevant counters. Your search function would then simply retrieve the aggregated data directly.
Example: if you have a weekly online voting system where you need to store each vote (also to check that no user votes twice), you could aggregate the votes on the fly using something like this:
export const handler: DynamoDBStreamHandler = async (event: DynamoDBStreamEvent) => {
await Promise.all(event.Records.map(async record => {
if (record.dynamodb?.NewImage?.vote?.S && record.dynamodb?.NewImage?.week?.S) {
await addVoteToResults(record.dynamodb.NewImage.vote.S, record.dynamodb.NewImage.week.S)
}
}))
}
where addVoteToResults is something like:
export const addVoteToResults = async (vote: string, week: string) => {
await dynamoDbClient.update({
TableName: 'table_name',
Key: { week: week },
UpdateExpression: 'add #vote :inc',
ExpressionAttributeNames: {
'#vote': vote
},
ExpressionAttributeValues: {
':inc': 1
}
}).promise();
}
Afterwards, when the voting is closed, you can retrieve the aggregated votes per week with a single get statement. This solution also helps spread the write/read load rather than causing a huge spike when your search function runs.
I have a collection in my Mongo Database called WorkOrder with 2 fields DateComplete and DateDue. Using those 2 fields I'd like to use the aggregation framework to count the number of 'Late' Work Orders by comparing the two fields. However the research I've found hasn't had any useful ways to format the query so that the 'Late' Work Orders will be filtered through. Does anyone know of a way to format a Mongo DB Aggregation Query (preferably in PHP) that can compare 2 fields in the collection?
EDIT:
An example entry in WorkOrder might look like
_id
some mongo id
DateDue
2014-10-10
DateCompleted
2014-10-12
This entry would want to be filtered through since DateCompleted is greater than DateDue. I didn't know about the $cond operator so I haven't tried anything for that yet.
EDIT:
I tried #BatScream's suggestion with the following query in my PHP script:
array(
'$cond' => array(
'if' => array(
'dateDue' => array(
'$lt' => 'dateComplete'
)
)
)
)
However the MongoCollection::Aggregate function told me that $cond wasn't a recognized operator.
EDIT: #BatScream's answer seems to work, but I wasn't aware that the group operator doesn't work properly after a $project is applied. I was hoping to be able to group these documents on another field, cID. Is that possible?
The aggregation pipeline below would give you the result, assuming your fields are of ISODate type. If not, I suggest you store them as ISODate rather than as strings.
db.collection.aggregate([
{$project:{"isLateWorkOrder":{$cond:[{$lt:["$dateDue","$dateCompleted"]},
true,false]}}},
{$match:{"isLateWorkOrder":true}},
{$group:{"_id":null,"lateWorkOrders":{$sum:1}}},
{$project:{"_id":0,"lateWorkOrders":1}}
])
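The PHP syntax should look similar to the following. Use single quotes, because PHP would otherwise try to interpolate "$cond" and friends as variables, and wrap each stage in its operator:
$projA = array('$project' => array('isLateWorkOrder' =>
    array('$cond' =>
        array(array('$lt' =>
            array('$dateDue', '$dateCompleted')),
        true, false))));
$matchA = array('$match' => array('isLateWorkOrder' => true));
$grp = array('$group' => array('_id' => null, 'lateWorkOrders' => array('$sum' => 1)));
$projB = array('$project' => array('_id' => 0, 'lateWorkOrders' => 1));
$pipeline = array($projA, $matchA, $grp, $projB);
$someCol->aggregate($pipeline);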
or, simply using the count function:
db.collection.count({$where:"this.dateDue < this.dateCompleted"})
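On the follow-up about grouping by cID: a $group stage only sees the fields that the preceding $project passed through, so cID has to be kept in the projection. A minimal sketch, assuming the field names used in the answer above (dateDue, dateCompleted) and the cID field from the question; lateWorkOrdersByCidPipeline is a hypothetical helper name.

```php
<?php
// Sketch: build a pipeline that counts late work orders per cID.
function lateWorkOrdersByCidPipeline(): array
{
    return [
        // Keep cID through the projection so the later $group can use it.
        ['$project' => [
            'cID' => 1,
            'isLateWorkOrder' => [
                '$cond' => [['$lt' => ['$dateDue', '$dateCompleted']], true, false],
            ],
        ]],
        // Drop everything that was completed on time.
        ['$match' => ['isLateWorkOrder' => true]],
        // One output document per cID with its number of late orders.
        ['$group' => ['_id' => '$cID', 'lateWorkOrders' => ['$sum' => 1]]],
    ];
}

echo json_encode(lateWorkOrdersByCidPipeline()), "\n";
```

The resulting array can be passed to aggregate() in the same way as $pipeline above.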
I'm creating some analytics script using PHP and MongoDB and I am a bit stuck. I would like to get the unique number of visitors per day within a certain time frame.
{
"_id": ObjectId("523768039b7e7a1505000000"),
"ipAddress": "127.0.0.1",
"pageId": ObjectId("522f80f59b7e7a0f2b000000"),
"uniqueVisitorId": "0445905a-4015-4b70-a8ef-b339ab7836f1",
"recordedTime": ISODate("2013-09-16T20:20:19.0Z")
}
The field to filter on is uniqueVisitorId and recordedTime.
I've created a database object in PHP that I initialise and it makes me a database connection when the object is constructed, then I have MongoDB php functions simply mapped to public function using the database connection created on object construction.
Anyhow, so far I get the number of visitors per day with:
public function GetUniqueVisitorsDiagram() {
// MAP
$map = new MongoCode('function() {
day = new Date(Date.UTC(this.recordedTime.getFullYear(), this.recordedTime.getMonth(), this.recordedTime.getDate()));
emit({day: day, uniqueVisitorId:this.uniqueVisitorId},{count:1});
}');
// REDUCE
$reduce = new MongoCode("function(key, values) {
var count = 0;
values.forEach(function(v) {
count += v['count'];
});
return {count: count};
}");
// STATS
$stats = $this->database->Command(array(
'mapreduce' => 'statistics',
'map' => $map,
'reduce' => $reduce,
"query" => array(
"recordedTime" =>
array(
'$gte' => $this->startDate,
'$lte' => $this->endDate
)
),
"out" => array(
"inline" => 1
)
));
return $stats;
}
How would I filter this data correctly to get unique visitors? Or would it be better to use aggregation? If so, could you be so kind as to help me out with a code snippet?
The $group operator in the aggregation framework was designed for exactly this use case and will likely be ~10 to 100 times faster. Read up on the group operator here: http://docs.mongodb.org/manual/reference/aggregation/group/
And the php driver implementation here: http://php.net/manual/en/mongocollection.aggregate.php
You can combine the $group operator with other operators to further limit your aggregations. It's probably best you do some reading up on the framework yourself to better understand what's happening, so I'm not going to post a complete example for you.
$m=new MongoClient();
$db=$m->super_test;
$db->gjgjgjg->insert(array(
"ipAddress" => "127.0.0.1",
"pageId" => new MongoId("522f80f59b7e7a0f2b000000"),
"uniqueVisitorId" => "0445905a-4015-4b70-a8ef-b339ab7836f1",
"recordedTime" => new MongoDate(strtotime("2013-09-16T20:20:19.0Z"))
));
var_dump($db->gjgjgjg->find(array('recordedTime'=>array('$lte'=>new MongoDate(),'$gte'=>new MongoDate(strtotime('-1 week')))))->count()); // Prints 1
$res=$db->gjgjgjg->aggregate(array(
array('$match'=>array('recordedTime'=>array('$lte'=>new MongoDate(),'$gte'=>new MongoDate(strtotime('-1 week'))),'uniqueVisitorId'=>array('$ne'=>null))),
array('$project'=>array('day'=>array('$dayOfMonth'=>'$recordedTime'),'month'=>array('$month'=>'$recordedTime'),'year'=>array('$year'=>'$recordedTime'))),
array('$group'=>array('_id'=>array('day'=>'$day','month'=>'$month','year'=>'$year'), 'c'=>array('$sum'=>1)))
));
var_dump($res['result']);
To answer the question entirely:
$m=new MongoClient();
$db=$m->super_test;
$db->gjgjgjg->insert(array(
"ipAddress" => "127.0.0.1",
"pageId" => new MongoId("522f80f59b7e7a0f2b000000"),
"uniqueVisitorId" => "0445905a-4015-4b70-a8ef-b339ab7836f1",
"recordedTime" => new MongoDate(strtotime("2013-09-16T20:20:19.0Z"))
));
var_dump($db->gjgjgjg->find(array('recordedTime'=>array('$lte'=>new MongoDate(),'$gte'=>new MongoDate(strtotime('-1 week')))))->count()); // Prints 1
$res=$db->gjgjgjg->aggregate(array(
array('$match'=>array('recordedTime'=>array('$lte'=>new MongoDate(),'$gte'=>new MongoDate(strtotime('-1 week'))),'uniqueVisitorId'=>array('$ne'=>null))),
array('$project'=>array('day'=>array('$dayOfMonth'=>'$recordedTime'),'month'=>array('$month'=>'$recordedTime'),'year'=>array('$year'=>'$recordedTime'))),
array('$group'=>array('_id'=>array('day'=>'$day','month'=>'$month','year'=>'$year','v'=>'$uniqueVisitorId'), 'c'=>array('$sum'=>1))),
array('$group'=>array('_id'=>array('day'=>'$_id.day','month'=>'$_id.month','year'=>'$_id.year'),'c'=>array('$sum'=>1)))
));
var_dump($res['result']);
Something close to that is what you're looking for, I believe.
It will return a set of documents that have the date as the _id, plus the count of unique visitors for that day. The count is irrespective of the visitor id's value; it only checks that the id is present.
Since you want it per day, you can actually replace the date parts with a single $dayOfYear field, I reckon (keep $year as well if the time frame can span more than one year).
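If the double $group is hard to picture, the same two steps can be illustrated in plain PHP on an in-memory array. This involves no database at all; the row shape and the uniqueVisitorsPerDay name are made up for the illustration.

```php
<?php
// Pure-PHP illustration of the two-step grouping (no database involved).
// $visits is a list of ['day' => 'Y-m-d', 'visitor' => string] rows.
function uniqueVisitorsPerDay(array $visits): array
{
    // Step 1: collapse repeat visits, like grouping on (day, uniqueVisitorId).
    $seen = [];
    foreach ($visits as $visit) {
        $seen[$visit['day']][$visit['visitor']] = true;
    }

    // Step 2: count the distinct visitors left for each day.
    $perDay = [];
    foreach ($seen as $day => $visitors) {
        $perDay[$day] = count($visitors);
    }
    return $perDay;
}

$visits = [
    ['day' => '2013-09-16', 'visitor' => 'a'],
    ['day' => '2013-09-16', 'visitor' => 'a'], // repeat visit, counted once
    ['day' => '2013-09-16', 'visitor' => 'b'],
    ['day' => '2013-09-17', 'visitor' => 'a'],
];
print_r(uniqueVisitorsPerDay($visits));
```

The first $group in the pipeline plays the role of step 1, and the second $group plays the role of step 2.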