MapReduce with PHP and MongoDB - php

I'm fairly new to Mongo and I have what I thought was a simple question. How do I do MapReduce with PHP and the non-legacy MongoDB driver http://php.net/manual/en/set.mongodb.php or the higher-level package mongodb/mongodb found at https://packagist.org/packages/mongodb/mongodb?
Every example I've seen seems to use the legacy driver (http://php.net/manual/en/book.mongo.php). They all use the MongoCode object, which doesn't exist in mongodb.php. It exists only in mongo.php (the legacy driver). When I try to use it, I get "Class 'MongoCode' not found".
My code looks something like:
$function = "function() { emit(this); }";
$map = new \MongoCode($function);
$command = $db->command([
"mapreduce" => "db.archiveData",
"map" => $map,
"query" => $query,
"out" => "data"
]);
To make things more confusing, when I look at the source at https://github.com/mongodb/mongo-php-library, there is a unit test for MapReduce (https://github.com/mongodb/mongo-php-library/blob/4dc36f6231df133a57ff0dc5a0123945133d25ba/tests/Operation/MapReduceFunctionalTest.php). But it uses the MongoDB\Operation\MapReduce, which doesn't seem to exist in the 1.1 version of mongodb/mongodb.
I thought maybe I would call it on the server using JavaScript. But when I look at http://php.net/manual/en/mongodb.execute.php, it says it "is deprecated in MongoDB 3.0+". So that doesn't feel like something I should use.
So is it that:
MapReduce is not supported with mongodb/mongodb. Or maybe it is not supported yet, but will be?
I have to use the legacy driver for MapReduce?
I have to figure out a way to call db.collection.mapReduce via JavaScript on the server?
I have to use the Aggregation Pipeline (https://docs.mongodb.com/manual/aggregation/) to do map reduce type of actions? But that feels much more limited.
What am I missing?

So I now have clarity on where things are at.
MapReduce will be officially supported in 1.2.0 of PHPLib (https://jira.mongodb.org/browse/PHPLIB-53)
Until then, there is a completely usable workaround by using the command object as per https://docs.mongodb.com/php-library/current/upgrade/#mapreduce-command-helper
Example is here as well:
$database = (new MongoDB\Client)->selectDatabase('db_name');
$cursor = $database->command([
'mapReduce' => 'collection_name',
'map' => new MongoDB\BSON\Javascript('...'),
'reduce' => new MongoDB\BSON\Javascript('...'),
'out' => 'output_collection_name',
]);
$resultDocument = $cursor->toArray()[0];
You can also use MapReduce via Doctrine (http://docs.doctrine-project.org/projects/doctrine-mongodb-odm/en/latest/reference/map-reduce.html), but that relies on the legacy driver and a shim, so it is probably not a good choice for a new project.

Related

Transaction in MongoDB 4.2 with new PHP Driver

I am new to MongoDB; I was a big fan of MySQL before. I recently moved to NoSQL and loved it, but now I am stuck on transactions in MongoDB.
I found some related questions on SO, but they either have no answers or are obsolete and do not work with the new MongoDB PHP driver, which changed a lot of syntax and functions. I can see that many newcomers like me are confused by the differences between the MongoDB docs and the PHP driver.
I found this way of committing transactions in MongoDB Docs
$client = new MongoDB\Driver\Manager("mongodb://127.0.0.1:27017");
$callback = function (\MongoDB\Driver\Session $session) use ($client)
{
$client->selectCollection('mydb1', 'foo')->insertOne(['abc' => 1], ['session' => $session]);
$client->selectCollection('mydb2', 'bar')->insertOne(['xyz' => 999], ['session' => $session]);
};
// Step 2: Start a client session.
$session = $client->startSession();
// Step 3: Use with_transaction to start a transaction, execute the callback, and commit
$transactionOptions =
[
'readConcern' => new \MongoDB\Driver\ReadConcern(\MongoDB\Driver\ReadConcern::LOCAL),
'writeConcern' => new \MongoDB\Driver\WriteConcern(\MongoDB\Driver\WriteConcern::MAJORITY, 1000),
'readPreference' => new \MongoDB\Driver\ReadPreference(\MongoDB\Driver\ReadPreference::RP_PRIMARY),
];
\MongoDB\with_transaction($session, $callback, $transactionOptions);
but this syntax seems to be obsolete for the new PHP driver, and it gives the following error:
Call to undefined function MongoDB\with_transaction()
According to the PHP docs, the new PHP driver for MongoDB provides the following methods for committing a transaction, but I don't understand how to use them because the docs give no example.
https://www.php.net/manual/en/mongodb-driver-manager.startsession.php
https://www.php.net/manual/en/mongodb-driver-session.starttransaction.php
https://www.php.net/manual/en/mongodb-driver-session.committransaction.php
My question is: how can I update the above code to use the new PHP driver's functions? I believe I need to use
MongoDB\Driver\Manager::startSession
MongoDB\Driver\Session::startTransaction
MongoDB\Driver\Session::commitTransaction
but I don't understand their syntax or arguments, because the documentation is incomplete and has no examples. Thank you in advance for your time and support.
OK, so I found the answer to my question, and I thought it could be helpful to others.
Using the core MongoDB extension:
$connection = new MongoDB\Driver\Manager("mongodb://127.0.0.1:27017");
$session = $connection->startSession();
$session->startTransaction();
$bulk = new MongoDB\Driver\BulkWrite(['ordered' => true]);
$bulk->insert(['x' => 1]);
$bulk->insert(['x' => 2]);
$bulk->insert(['x' => 3]);
$result = $connection->executeBulkWrite('db.users', $bulk, ['session' => $session]);
$session->commitTransaction();
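Using the PHP library:
$client = new MongoDB\Client("mongodb://127.0.0.1:27017");
$collection = $client->selectCollection('mydb1', 'foo');
$session = $client->startSession();
$session->startTransaction();
try {
    // Pass the session to every operation that belongs to the transaction.
    $collection->insertOne(['abc' => 1], ['session' => $session]);
    $session->commitTransaction();
} catch (Exception $e) {
    $session->abortTransaction();
    throw $e; // rethrow so the caller sees the failure
}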
Note: To keep the answer short and to the point, I have omitted some of the optional parameters and used dummy insert data; the core extension example also leaves out the try-catch.
If you are running a standalone MongoDB instance for development or testing, you might get an error like
transaction numbers are only allowed on a replica set member or mongos
In that case, you can convert the standalone instance to a replica set by following this guide: https://docs.mongodb.com/manual/tutorial/convert-standalone-to-replica-set/

How to make aggregation, using PHP MongoDB driver

I use this MongoDB driver in my application. I like it because it works well. I can run simple select and insert statements, like:
$em = new MongoDB\Driver\Manager("mongodb://localhost:27017/testdb");
$query = new MongoDB\Driver\Query(['name' => 'John']);
// executeQuery() expects a full namespace in the form "database.collection"
$res = $em->executeQuery('testdb.users', $query);
But the problem is that I cannot find a single example of running an aggregation. The PHP documentation does not say a word about it, while the MongoDB documentation seems to use another library:
$collection = (new MongoDB\Client)->test->restaurants;
$cursor = $collection->find([
'name' => new MongoDB\BSON\Regex('^' . preg_quote('(Library)')),
]);
It seems like an example from another library, because MongoDb driver does not have this MongoDB\Client class. There is such a class, but in a deprecated library. So, what is the right way to make aggregations in PHP, using modern MongoDb driver?
This extension provides a minimal API for core driver functionality: commands, queries, writes, connection management, and BSON serialization.
https://www.php.net/manual/en/set.mongodb.php
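To run the aggregate command, wrap a command document of the following form in MongoDB\Driver\Command and pass it to MongoDB\Driver\Manager::executeCommand():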
{
aggregate: "<collection>" || 1,
pipeline: [ <stage>, <...> ],
explain: <boolean>,
allowDiskUse: <boolean>,
cursor: <document>,
maxTimeMS: <int>,
bypassDocumentValidation: <boolean>,
readConcern: <document>,
collation: <document>,
hint: <string or document>,
comment: <string>,
writeConcern: <document>
}
https://docs.mongodb.com/manual/reference/command/aggregate/#syntax
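For illustration, here is a minimal sketch of how that command document could be built and executed from PHP with the extension alone. The database name testdb, the collection users, and the helper names buildAggregateCommand and runAggregate are assumptions made up for this example; only MongoDB\Driver\Manager, MongoDB\Driver\Command, and executeCommand() come from the extension itself.

```php
<?php
// Builds the aggregate command document as a plain PHP array.
// Hypothetical helper, not part of the driver.
function buildAggregateCommand(string $collection, array $pipeline): array
{
    return [
        'aggregate' => $collection,
        'pipeline'  => $pipeline,
        // An empty document asks the server for a cursor with the default batch size.
        'cursor'    => new stdClass(),
    ];
}

// Runs the command and returns all result documents.
// Hypothetical helper; requires the mongodb extension and a reachable server.
function runAggregate(MongoDB\Driver\Manager $manager, string $database, array $commandDocument): array
{
    $cursor = $manager->executeCommand($database, new MongoDB\Driver\Command($commandDocument));
    return $cursor->toArray();
}

// Example pipeline: count the users named John.
$commandDocument = buildAggregateCommand('users', [
    ['$match' => ['name' => 'John']],
    ['$group' => ['_id' => '$name', 'count' => ['$sum' => 1]]],
]);

// Usage (assumes a server on localhost and a database called "testdb"):
// $manager = new MongoDB\Driver\Manager('mongodb://localhost:27017');
// $results = runAggregate($manager, 'testdb', $commandDocument);
```

Note that executeCommand() takes the database name, not a "database.collection" namespace; the collection goes inside the command document.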

How can I integrate elasticsearch to mysql using php

I was working on a project which is required to use elasticsearch. I followed the guide: https://www.elastic.co/guide/en/elasticsearch/client/php-api/current/index.html
It works perfectly for me:
require 'vendor/autoload.php';
use Elasticsearch\ClientBuilder;
$hosts = [
'myhost'
];
$client = ClientBuilder::create() // Instantiate a new ClientBuilder
->setHosts($hosts) // Set the hosts
->build();
$params = [
'index' => 'php-demo-index',
'type' => 'doc',
'id' => 'my_id',
'body' => ['testField' => 'abc']
];
$response = $client->index($params);
print_r($response);
That's only the basics, though. What I want is to integrate this with MySQL, i.e. whenever I insert into or update a table in my database, the row gets indexed automatically in Elasticsearch.
I know that Logstash can poll the database at a given interval and index the results into Elasticsearch. But I want the indexing to happen automatically right after the insert, using PHP and without Logstash.
I know of such a library for Node.js and MongoDB, namely mongoosastic: https://www.npmjs.com/package/mongoosastic. Is there a PHP library that can do this automatically? Please provide sample code if you know one.
There are indeed libraries that automate this task. However, they generally require an ORM such as Doctrine in order to hook gracefully into your database layer. If you are able to use the Symfony framework in your project, there is a library called FOSElasticaBundle which keeps your indices in sync with your database operations.
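If an ORM is not an option, the same idea can be done by hand: index the document right after the insert succeeds. The following is only a rough sketch under several assumptions: the articles table, its title and body columns, and the insertAndIndex helper are invented for the example, and $client is assumed to be any object exposing index(array $params) the way the Elasticsearch client in the question does. The 'type' key from the question is left out here; add it back if your Elasticsearch version requires it.

```php
<?php
/**
 * Inserts a row and then indexes the same data under the new row's id.
 * Hypothetical helper: table, columns, and index name are assumptions.
 */
function insertAndIndex(PDO $pdo, object $client, string $title, string $body)
{
    $statement = $pdo->prepare('INSERT INTO articles (title, body) VALUES (:title, :body)');
    $statement->execute([':title' => $title, ':body' => $body]);

    // Reuse the database id as the document id so later updates overwrite the same document.
    $id = $pdo->lastInsertId();

    return $client->index([
        'index' => 'articles',
        'id'    => $id,
        'body'  => ['title' => $title, 'body' => $body],
    ]);
}
```

Be aware that this approach is not atomic: if the index call fails after the insert has succeeded, the database and the index drift apart, so some kind of retry or periodic re-sync is still worth having.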

How to create MongoDB views in PHP using MongoDB\Client() library?

I'm using following:
PHP 7.2
MongoDB 3.4
Pecl 1.5.2
I'm working on a Laravel project that uses MongoDB as its database. I have a few collections on which I have to create MongoDB views using a Laravel migration, and I was wondering whether it's possible to create MongoDB views from PHP. Currently I have a workaround: I have created a JavaScript file that contains a MongoDB db.createView() call and takes the view name and collection name as parameters. In the code below, $db holds the database name, $view the view name, $collection the collection name, and $script the path to the JavaScript file. I'm writing this code in the migration class's up() method.
$cmd = "mongo $db --eval \"var view='$view', collection='$collection'\" $script";
exec($cmd);
In my Javascript file, I have code something like following
db.createView(view, collection, <aggregate query>);
So, as you can see, I'm running a terminal command from PHP to create the views. Is there a PHP function in the MongoDB library to create views directly?
If you're using mongo with Laravel, I'm going to assume you're using jenssegers/mongodb to use it with Eloquent.
So, let's assume you have your mongo database set up as your 'mongodb' database connection. You need the MongoDB\Database for your database. You can get this with:
$mongo = app('db')->connection('mongodb')->getMongoDB();
Of course, if you're not using jenssegers/mongodb, you can still do the same thing with mongodb/mongodb as well.
$mongo = (new MongoDB\Client)->selectDatabase($db);
This has a method called command (see https://docs.mongodb.com/php-library/current/reference/method/MongoDBDatabase-command/), which corresponds to the db.runCommand method from the mongo cli. db.createView calls that method (see https://docs.mongodb.com/manual/reference/method/db.createView/#db.createView)
So, you can use $mongo->command to create the view like this:
$mongo->command([
'create' => $view,
'viewOn' => $collection,
'pipeline' => $aggregateQuery,
'collation' => ['locale' => 'en'],
]);
You can use this library mongoPhpLibrary
This will make your work easy

All requests requiring connection to mysql are very very slow (using Phalcon)

I've been working on converting an application of mine from CodeIgniter to Phalcon. I've noticed that [query heavy] requests that only took a maximum of 3 or 4 seconds using CI are taking up to 30 seconds to complete using Phalcon!
I've spent days trying to find a solution. I've tried using all the different means of access offered by the framework including submitting raw query strings directly to Phalcon's MySql PDO adapter.
I'm adding my database connection to the service container exactly like it is shown in Phalcon's INVO tutorial:
$di->set('db', function() use ($config) {
return new \Phalcon\Db\Adapter\Pdo\Mysql(array(
"host" => $config->database->host,
"username" => $config->database->username,
"password" => $config->database->password,
"dbname" => $config->database->name
));
});
Using webgrind output I've been able to narrow the bottleneck down to the constructor in Phalcon's PDO adapter class (cost is in milliseconds):
I've already profiled and manually tested the relevant SQL to make sure the bottleneck isn't in the database (or my poorly constructed SQL!)
I've discovered the problem, which to me wasn't immediately apparent, so hopefully others will find this useful as well.
Every time a new query was started, the application was getting a new instance of the database adapter. The request which produced the webgrind output above had a total of 20 queries.
While re-reading Phalcon's documentation section on dependency injection I saw that services can optionally be added to the service container as a "shared" service, which effectively forces the object to act as a singleton, meaning that once one instance of the class is created, the application will simply pass that instance to any request instead of creating a new instance.
There are several methods to force a service to be added as a shared service, details of which can be found here in Phalcon's Documentation:
http://docs.phalconphp.com/en/latest/reference/di.html#shared-services
Changing the code posted above to be added as a shared service looks like this:
$di->setShared('db', function() use ($config) {
return new \Phalcon\Db\Adapter\Pdo\Mysql(array(
"host" => $config->database->host,
"username" => $config->database->username,
"password" => $config->database->password,
"dbname" => $config->database->name
));
});
Here's what the webgrind output looks like for the same query referenced above, but after setting the database service to be added as a shared service (cost in milliseconds):
Notice that the invocation count is now 1 instead of 20, and the invocation cost dropped from 20 seconds down to 1 second!
I hope someone else finds this useful!
In most examples, services are shared by default, though not in the most obvious way; it is done via:
$di->set('service', …, true);
The last boolean argument passed to set() makes the service shared, and in 99.9% of cases you'd want your DI services to be that way. Otherwise, problems similar to the one described by @the-notable would occur, but because they are likely to be less "impactful", they would be hard to track down.
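To make the difference concrete, here is a small self-contained sketch. ToyContainer and FakeConnection are made-up classes for illustration only; this is not how Phalcon's DI is implemented, it just mimics the set()/setShared() behaviour described above and counts how many times the "adapter" gets constructed over 20 lookups.

```php
<?php
// Toy container for illustration only; this is NOT Phalcon's DI implementation.
final class ToyContainer
{
    private array $definitions = [];
    private array $instances = [];

    public function set(string $name, callable $factory, bool $shared = false): void
    {
        $this->definitions[$name] = ['factory' => $factory, 'shared' => $shared];
        unset($this->instances[$name]);
    }

    public function setShared(string $name, callable $factory): void
    {
        $this->set($name, $factory, true);
    }

    public function get(string $name)
    {
        if (!isset($this->definitions[$name])) {
            throw new InvalidArgumentException("Unknown service: $name");
        }
        $definition = $this->definitions[$name];
        if (!$definition['shared']) {
            // Non-shared: run the factory on every lookup.
            return ($definition['factory'])();
        }
        if (!array_key_exists($name, $this->instances)) {
            // Shared: run the factory once and cache the result.
            $this->instances[$name] = ($definition['factory'])();
        }
        return $this->instances[$name];
    }
}

// Stand-in for an expensive database adapter: counts its own constructions.
final class FakeConnection
{
    public static int $constructed = 0;

    public function __construct()
    {
        self::$constructed++;
    }
}

// Resolves the 'db' service $queries times and reports how many connections were built.
function countConstructions(bool $shared, int $queries): int
{
    FakeConnection::$constructed = 0;
    $di = new ToyContainer();
    $di->set('db', fn () => new FakeConnection(), $shared);
    for ($i = 0; $i < $queries; $i++) {
        $di->get('db');
    }
    return FakeConnection::$constructed;
}

echo countConstructions(false, 20), "\n"; // prints 20
echo countConstructions(true, 20), "\n";  // prints 1
```

With a real adapter, each of those constructions opens a new database connection, which is where the time in the question was going.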
