Project Mongo collection array value under a condition - php

I am using the MongoDB PHP library in a service responsible for storing and retrieving social data, specifically from Facebook. Here is a snippet of a collection of post insights:
{
"_id":"5865aa8e9bbbe400010f97a2",
"insights":[
{
"name":"post_story_adds_unique",
"period":"lifetime",
"values":[
{
"value":10
}
]
},
{
"name":"post_story_adds",
"period":"lifetime",
"values":[
{
"value":11
}
]
},
{
"name":"post_story_adds_by_action_type_unique",
"period":"lifetime",
"values":[
{
"value":{
"like":10,
"comment":1
}
}
]
},
{
"name":"post_story_adds_by_action_type",
"period":"lifetime",
"values":[
{
"value":{
"like":10,
"comment":1
}
}
]
},
{
"name":"post_impressions_unique",
"period":"lifetime",
"values":[
{
"value":756
}
]
}
]
},
{
"_id":"586d939b9bbbe400010f9f12",
"insights":[
{
"name":"post_story_adds_unique",
"period":"lifetime",
"values":[
{
"value":76
}
]
},
{
"name":"post_story_adds",
"period":"lifetime",
"values":[
{
"value":85
}
]
},
{
"name":"post_story_adds_by_action_type_unique",
"period":"lifetime",
"values":[
{
"value":{
"like":73,
"comment":8,
"share":2
}
}
]
},
{
"name":"post_story_adds_by_action_type",
"period":"lifetime",
"values":[
{
"value":{
"like":74,
"comment":9,
"share":2
}
}
]
},
{
"name":"post_impressions_unique",
"period":"lifetime",
"values":[
{
"value":9162
}
]
}
]
}
Note that every post contains all of the metric sub-documents, each with its corresponding values.
I believe the data structure presented above is not optimal to work with: to project the values of a couple of these sub-documents, say the value property of post_story_adds_by_action_type and post_impressions_unique, we need to run a condition on all the sub-documents in order to match the name property against post_story_adds_by_action_type or post_impressions_unique.
I tried $elemMatch (projection) but, as the documentation says, it only returns the first matching array element. I was able to get that far as follows:
$this->db->$collection->find([],
[
'projection' => [
'insights' => [
'$elemMatch' => [
'name' => 'post_impressions_unique'
]
],
'_id' => 0,
'insights.values.value' => 1,
],
]);
A MongoDB\Driver\Cursor object is returned containing only the desired value (selected by 'insights.values.value' => 1) of the sub-documents whose name equals post_impressions_unique.
The first thing I thought of was, of course, the $or logical operator:
$this->db->$collection->find([],
[
'projection' => [
'insights' => [
'$elemMatch' => [
'$or' => [
[
'name' => [
'$eq' => 'post_story_adds_by_action_type',
]
],
[
'name' => [
'$eq' => 'post_impressions_unique',
]
]
]
]
]
],
'_id' => 0,
'insights.name' => 1,
'insights.values.value.like' => 1,
'insights.values.value' => 1,
]);
Note the projections 'insights.values.value.share' => 1 and 'insights.values.value' => 1, which correspond to the different positions of value in the sub-documents.
Of course this didn't work: I got an array of post_impressions_unique sub-documents alone. So I tried the aggregation framework (https://docs.mongodb.com/manual/reference/operator/aggregation/filter/):
$this->db->$name->aggregate(
[
[
'$project' => [
'insights' => [
'$filter' => [
'input' => '$insights',
'as' => 'name',
'cond' => [
'$or' => [
[
'$eq' => [
'name', '$post_story_adds_by_action_type',
]
],
[
'$eq' => [
'name', '$post_impressions_unique',
]
],
],
],
],
],
'_id' => 0,
'values.value.like' => 1,
'values.value' => 1,
],
]
]);
This didn't work either; in this case I got an array of empty insights objects.
I considered using the laravel-mongodb package to take advantage of the Eloquent builder:
DB::collection($collection)->project(['insights' => ['$elemMatch' => ['name' => 'post_story_adds_unique']]])->get();
or
DB::collection($collection)->project(['insights' => [
'$filter' => [
'input' => '$insights',
'as' => 'name',
'cond' => [
'name' => 'post_story_adds_unique',
]
]
],
'values.value.like' => 1
])->get();
But I still couldn't get the value within the sub-document. I checked the Builder::project() function and it seems to use the aggregation framework internally as well, but I wasn't able to figure out the appropriate syntax, if there is one.
My questions are as follows:
1- How can I retrieve the insights.values.value and name properties of the sub-documents whose name matches post_impressions_unique? And likewise, how can I retrieve the insights.values.value.share and name properties of the sub-documents whose name matches post_story_adds_by_action_type?
2- What is the correct syntax for using $filter within a $project (aggregation) stage?
This is most of the research I have done so far, and it feels as if I am running in circles.
I appreciate your help.
Thank you.

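You can try something like this.
Inside $filter and $map, use the $$ notation to access the variable named by as. In your attempt, 'name' was treated as a string literal and '$post_story_adds_by_action_type' as a field path, so no array element ever matched.
$map is then used to trim the response so that only values and name are returned for each element of the insights array.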
[
[
'$project' => [
'insights' => [
'$map' => [
'input' => [
'$filter' => [
'input' => '$insights',
'as' => 'insightf',
'cond' => [
'$or' => [
[
'$eq' => [
'$$insightf.name', 'post_story_adds_by_action_type',
]
],
[
'$eq' => [
'$$insightf.name', 'post_impressions_unique',
]
],
],
],
],
],
'as' => 'insightm',
'in' => [
'values' => '$$insightm.values.value',
'name' => '$$insightm.name',
],
],
],
'_id' => 0
],
]
];
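To make the shape of the result easier to picture, here is a rough plain-PHP sketch of what the $filter and $map stages do to a single document. It does not talk to MongoDB at all; the $post array is a hypothetical, trimmed copy of the sample data from the question, and array_filter / array_map only stand in for the two operators.

```php
<?php
// Hypothetical in-memory copy of one post, trimmed from the sample in the question.
$post = [
    '_id' => '5865aa8e9bbbe400010f97a2',
    'insights' => [
        ['name' => 'post_story_adds', 'period' => 'lifetime',
         'values' => [['value' => 11]]],
        ['name' => 'post_story_adds_by_action_type', 'period' => 'lifetime',
         'values' => [['value' => ['like' => 10, 'comment' => 1]]]],
        ['name' => 'post_impressions_unique', 'period' => 'lifetime',
         'values' => [['value' => 756]]],
    ],
];

$wanted = ['post_story_adds_by_action_type', 'post_impressions_unique'];

// Plays the role of $filter: keep only the insights whose name is wanted.
$filtered = array_values(array_filter(
    $post['insights'],
    function (array $insight) use ($wanted) {
        return in_array($insight['name'], $wanted, true);
    }
));

// Plays the role of $map: reshape each kept insight to {values, name}.
// array_column() mirrors '$$insightm.values.value', which collects the
// "value" field of every element of the "values" array.
$projected = array_map(function (array $insight) {
    return [
        'values' => array_column($insight['values'], 'value'),
        'name'   => $insight['name'],
    ];
}, $filtered);

echo json_encode(['insights' => $projected], JSON_PRETTY_PRINT), PHP_EOL;
```

Each remaining element carries a values array (one entry per element of the original values array) and its name, which is the same shape the pipeline above should return per post.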

Related

Search with "And" operator in Elastic search PHP package

I'm trying to learn Elasticsearch with the help of the PHP Composer package. I have an index named media_data which contains the fields nits_account, nits_url, session_id and timestamp. I want to filter on these fields, combined with an AND operator. My current query code is:
$items = $this->elasticsearch->search([
'index' => $index_name,
'body' => [
'query' => [
'filtered' => [
'filter' => [
'and' => [
['match' => ['nits_account' => 'xyzABCD2190-aldsj']], //API Key
['match' => ['nits_url' => 'google.com']],
]
]
]
]
]
]);
My question:
This returns no data. But if I use the code below:
$items = $this->elasticsearch->search([
'index' => $index_name,
'body' => [
'query' => [
'bool' => [
'should' => [
['match' => ['nits_account' => $account] ],
['match' => ['nits_url' => $domain] ],
],
],
]
]
]);
I get results combined with OR, but I need an AND operation.
How can I apply a different search operation to each field? I want nits_account to be an exact match, nits_url to use a like/wildcard match, and timestamp to be comparable (greater than, less than, or between two dates).
Try this:
$items = $this->elasticsearch->search([
'index' => $index_name,
'body' => [
'query' => [
'bool' => [
'must' => [
['match' => ['nits_account' => $account] ],
['match' => ['nits_url' => $domain] ]
],
],
]
]
]);
You should use the must keyword, not the should keyword: must acts like an AND operation, while should acts like an OR operation.
See https://stackoverflow.com/a/28768600/5430055
If you need a conditional match, use something like this:
"query": {
"bool": {
"must": [
{
"range": {
"nits_url": {
"gte": 1000,
"lte": 10000
}
}
},
{
"match": {
"nits_account": "$account"
}
}
]
}
}
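For the three kinds of matching asked about in the question (exact, wildcard, date range), a request body could be sketched in PHP along these lines. This is only a sketch under assumptions: it presumes nits_account is indexed unanalyzed so that term matches it exactly, that timestamp is mapped as a date, and the variable values are placeholders.

```php
<?php
// Placeholder inputs; replace with real values.
$account       = 'xyzABCD2190-aldsj';
$domainPattern = '*google.com*';
$from          = '2017-10-01T00:00:00';
$to            = '2017-11-01T00:00:00';

$body = [
    'query' => [
        'bool' => [
            // Every clause in "must" has to match, i.e. AND semantics.
            'must' => [
                // Exact match (assumes the field is not analyzed).
                ['term' => ['nits_account' => $account]],
                // Like/wildcard match.
                ['wildcard' => ['nits_url' => $domainPattern]],
                // Between two dates (assumes a date mapping).
                ['range' => ['timestamp' => ['gte' => $from, 'lt' => $to]]],
            ],
        ],
    ],
];

echo json_encode($body, JSON_PRETTY_PRINT), PHP_EOL;
```

The array would then be passed as the 'body' entry of the search call, in the same way as in the examples above.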

Elasticsearch [Query Bool Must Match] performs OR operation instead of AND

I'm trying to perform a basic login operation where my view (the front end) accepts a username and password through a form.
In SQL, the query would look like this:
SELECT * FROM users WHERE username = $_POST['username'] AND password = $_POST['password'];
According to the official documentation of the Elasticsearch PHP API, it should go like this:
$params = [
'index' => 'myIndex',
'type' => 'myType',
'body' => [
'query' => [
"bool" => [
"must" => [
"match" => [
"username" => 'email@email.com',
],
"match" => [
"password" => 'mypassword',
],
]
]
]
]
];
Unfortunately, it returns A LOT of documents, so I presume it is performing an OR instead of matching both conditions together.
FYI, in case you wonder why so many documents are returned according to the "hits" property: there are many user documents with the same password.
Main Question
Is there a proper ES query that matches my username AND password so that I retrieve only one document? I've been searching through the official documentation, but nothing produces the desired output.
Thank you very much!
You're almost there. You need to enclose each match query in its own array; otherwise bool/must becomes an associative array with two identical "match" keys, and that's not what you want: PHP keeps only the last value for a duplicated key, so the username match is discarded and only the password match is sent.
$params = [
'index' => 'myIndex',
'type' => 'myType',
'body' => [
'query' => [
"bool" => [
"must" => [
[ // <-- added
"match" => [
"username" => 'email@email.com',
]
], // <-- added
[ // <-- added
"match" => [
"password" => 'mypassword',
]
], // <-- added
]
]
]
]
];
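The effect of the duplicated key can be checked without Elasticsearch by encoding both variants with json_encode. The credentials below are placeholders; the only point is what ends up in the JSON.

```php
<?php
// Structure from the question: both entries share the key "match",
// so the second assignment overwrites the first.
$broken = [
    'match' => ['username' => 'user@example.com'],
    'match' => ['password' => 'mypassword'],
];

// Structure from the answer: each clause is wrapped in its own array,
// so "must" becomes a list of two clauses.
$fixed = [
    ['match' => ['username' => 'user@example.com']],
    ['match' => ['password' => 'mypassword']],
];

echo json_encode($broken), PHP_EOL;
// {"match":{"password":"mypassword"}}

echo json_encode($fixed), PHP_EOL;
// [{"match":{"username":"user@example.com"}},{"match":{"password":"mypassword"}}]
```

The first output contains only the password clause, which would explain why every user sharing that password was returned.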
With the help of Val, I was able to work out a variation of my query, and got the expected result with the following:
$params = [
'index' => $index,
'type' => 'index',
'size' => 250,
'body' => [
'query' => [
'bool' => [
'must' => [
[
"match" => [
"usr_username" =>
[
"query" => $username,
"operator" => "and"
]
]
],
[
"match" => [
"usr_password" => [
"query" => $password,
"operator" => "and"
]
]
]
]
]
]
]
];

PHP expected [END_OBJECT] but got [FIELD_NAME], possibly too many query clauses

Elasticsearch 2.4.5
PHP 7.0
I have the following aggregation query that works via curl but fails when I convert it to PHP. I feel like I'm missing something stupid/easy and am just looking for another set of eyes.
curl -XPOST "http://localhost:9200/_search" -d '
{
"size": 1,
"query": {
"filtered": {
"filter": {
"bool": {
"must": [
{"type": { "value": "web_logs" }},
{"range": {
"@timestamp": {
"gte": "2017-10-01T00:00:00.000" ,
"lt": "2017-11-01T00:00:00.000"
}
}}
]
}
}
}
},
"aggs": {
"by_company": {
"terms": {
"field": "company.raw"
},
"aggs": {
"total_bytes": {
"sum": {
"field": "sc_bytes"
}
}
}
}
}
}'
But when I try to convert it to PHP I get an error
<?php
ini_set('display_errors', 0);
use Elasticsearch\ClientBuilder;
require 'vendor/autoload.php';
$hosts = [
'localhost:9200'
];
$client = ClientBuilder::create()
->setHosts($hosts)
->build();
$params['body'] = [
'size' => 1,
'query' => [
'filtered' => [
'filter' => [
'bool' => [
'must' => [
'type' => [
'value' => 'web_logs'
],
'range' => [
'@timestamp' => [
'gte' => '2017-10-01T00:00:00.000',
'lt' => '2017-11-01T00:00:00.000'
]
]
],
]
]
]
],
'aggs' => [
'by_company' => [
'terms' => [
'field' => 'company.raw'
],
'aggs' => [
'total_bytes' => [
'sum' => [
'field' => 'sc_bytes'
]
]
]
]
]
];
$results = $client->search($params);
Here is the error
$ php report2.php
PHP Fatal error: Uncaught Elasticsearch\Common\Exceptions\BadRequest400Exception: {"error":{"root_cause":[{"type":"query_parsing_exception","reason":"expected [END_OBJECT] but got [FIELD_NAME], possibly too many query clauses","index":"logstash-2017.09.11","line":1,"col":92},{"type":"query_parsing_exception","reason":"expected [END_OBJECT] but got [FIELD_NAME], possibly too many query clauses","index":"logstash-2017.09.12","line":1,"col":92},{"type":"query_parsing_exception","reason":"expected [END_OBJECT] but got [FIELD_NAME], possibly too many query clauses","index":"logstash-2017.09.13","line":1,"col":92},{"type":"query_parsing_exception","reason":"expected [END_OBJECT] but got [FIELD_NAME], possibly too many query clauses","index":"logstash-2017.09.14","line":1,"col":92},{"type":"query_parsing_exception","reason":"expected [END_OBJECT] but got [FIELD_NAME], possibly too many query clauses","index":"logstash-2017.09.15","line":1,"col":92},{"type":"query_parsing_exception","reason":"expected [END_OBJECT] but got [FIELD_NAME in /mnt/c/Users/Chris/code/logstash/vendor/elasticsearch/elasticsearch/src/Elasticsearch/Connections/Connection.php on line 610
Your bool/must clause needs to be an array of clauses, so you're missing a few square brackets:
'bool' => [
'must' => [
[ // <-- added
'type' => [
'value' => 'web_logs'
]
], // <-- added
[ // <-- added
'range' => [
'@timestamp' => [
'gte' => '2017-10-01T00:00:00.000',
'lt' => '2017-11-01T00:00:00.000'
]
]
] // <-- added
],
]
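One way to catch this kind of slip before the request is sent is to check that must is a plain 0-indexed list, since only that is encoded as a JSON array. The helper below, isJsonList, is a hypothetical name made up for this sketch, not part of the Elasticsearch client; the field name @timestamp is assumed to be the Logstash default.

```php
<?php
// Hypothetical helper: true when $value is a 0-indexed PHP list, which
// json_encode() emits as a JSON array rather than a JSON object.
function isJsonList(array $value): bool
{
    return $value === [] || array_keys($value) === range(0, count($value) - 1);
}

// Assumed Logstash default field name.
$range = ['@timestamp' => [
    'gte' => '2017-10-01T00:00:00.000',
    'lt'  => '2017-11-01T00:00:00.000',
]];

// As written in the question: string keys, so this encodes as one object
// holding two query clauses.
$broken = ['type' => ['value' => 'web_logs'], 'range' => $range];

// As in the answer: a list with one clause per element.
$fixed = [['type' => ['value' => 'web_logs']], ['range' => $range]];

var_dump(isJsonList($broken)); // bool(false)
var_dump(isJsonList($fixed));  // bool(true)
```

With the string-keyed variant, the query parser receives a single object containing both type and range, which is consistent with the "expected [END_OBJECT] but got [FIELD_NAME]" message in the error.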

parsing_exception: no [query] registered for [filtered]

I am trying to find all the results that do not contain the field 'open_location', using the code below, but printing the result gives me this error:
parsing_exception: no [query] registered for [filtered]
I have looked at this question for a solution:
Best way to check if a field exist in an Elasticsearch document
But
Please help me...
$index_name=$db_name.'_temp_traking';
$para= [
'index' => $index_name,
'type' => $index_name,
'body' => [
'query' => [
'filtered' => [
'filter' => [
'bool' => [
'must_not' => [
'missing' => [
'field' => 'open_location'
]
]
]
]
]
]
]
];
$response = $client->search($para);
The filtered query was deprecated in Elasticsearch 2.0 and removed in 5.0, which I guess is the version you are using. Also, you say you're looking for docs that do NOT contain the field, but your code says the field 'must not' be 'missing', i.e. it selects docs that do contain it.
If you need the field not to exist, try this:
"query": {
"bool": {
"must_not": {
"exists": {
"field": "open_location"
}
}
}
}
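Translated into the PHP array form used in the question, the two cases (field absent, field present) might look like the sketch below. The index name is a placeholder, and the arrays are only built and printed here, not sent to a cluster.

```php
<?php
$indexName = 'mydb_temp_traking'; // placeholder index name

// Documents that do NOT have the open_location field.
$withoutField = [
    'index' => $indexName,
    'body'  => [
        'query' => [
            'bool' => [
                'must_not' => [
                    ['exists' => ['field' => 'open_location']],
                ],
            ],
        ],
    ],
];

// Documents that DO have the open_location field.
$withField = [
    'index' => $indexName,
    'body'  => [
        'query' => [
            'bool' => [
                'filter' => [
                    ['exists' => ['field' => 'open_location']],
                ],
            ],
        ],
    ],
];

echo json_encode($withoutField['body']), PHP_EOL;
echo json_encode($withField['body']), PHP_EOL;
```

Either array would be passed to $client->search() as in the question.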

PHP API to get Documents

I am using a PHP API to post documents into Elasticsearch, but I need to retrieve the last document posted according to its timestamp.
The Sense query I am currently using is this:
GET index-*/type/_search
{
"query": {
"match_all": {}
},
"size": 1,
"sort": [
{
"timestamp": {
"order": "desc"
}
}
]
}
I have translated it to my PHP API:
$params = [
'index' => 'index-*',
'type' => 'type',
'custom' => [
'query'=> [
'match_all'=> []
],
'size'=> 1,
'sort'=> [
[
'timestamp'=> [
'order'=> 'desc'
]
]
]
]
];
$response = $client->get($params);
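Unfortunately it keeps throwing errors asking for an 'id', but my IDs are generated by Elasticsearch, so I can't do it any other way. Is there a way around this? Thanks
You need to use the search method, with the query under the 'body' key; get() fetches a single document by its ID, which is why it asks for one: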
$params = [
'index' => 'index-*',
'type' => 'type',
'body' => [
'query'=> [
'match_all'=> []
],
'size'=> 1,
'sort'=> [
[
'timestamp'=> [
'order'=> 'desc'
]
]
]
]
];
$response = $client->search($params);
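One detail to watch when translating "match_all": {} into PHP: an empty PHP array is encoded as a JSON array, not an object. Depending on the Elasticsearch version this may or may not be accepted, so it can be safer to use an empty stdClass, as this small sketch shows.

```php
<?php
// An empty PHP array becomes a JSON array.
$asArray = ['match_all' => []];

// An empty stdClass becomes a JSON object, matching the Sense query.
$asObject = ['match_all' => new \stdClass()];

echo json_encode($asArray), PHP_EOL;  // {"match_all":[]}
echo json_encode($asObject), PHP_EOL; // {"match_all":{}}
```

If the query above is rejected with a parsing error, swapping the empty array for new \stdClass() is worth trying first.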
