I have a nested document with some language-dependent content, and I want to check whether the content has data for a specific language. The query should return that content if it exists, or false otherwise.
I tried this query:
$data = $collection->findOne(array('original'=>'What is this', 'translation.language'=>'english') );
I am expecting this result:
{
"language": "english",
"quote": "What is this"
}
but the above query returns the content for both languages. Can anyone please also share some code for saving and updating data using PHP?
My collection:
{
"_id": ObjectId("56a8844bc56760810e483815"),
"language": "english",
"original_key": "What is this",
"translation": [
{
"language": "english",
"quote": "What is this"
},
{
"language": "spanish",
"quote": "What is this Spanish"
}
]
}
Use the positional $ operator in the projection document of the findOne() method when you only need one particular array element in selected documents:
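// search criteria for the nested array
$query = array(
    'original_key' => 'What is this', // field name as stored in the sample document
    'translation.language' => 'english'
);
// projection (fields to include); "translation.$" keeps only the first matching element
$projection = array('_id' => false, 'translation.$' => true);
$result = $collection->findOne($query, $projection);
// findOne() returns null when nothing matches
$data = $result ? $result['translation'][0] : false;
var_dump($data);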
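If you are on the newer `mongodb/mongodb` PHP library rather than the legacy driver used above, the projection goes under a `projection` key in the options array instead of being passed directly as the second argument. A minimal sketch, assuming the field is named `original_key` as in the sample document; `buildTranslationLookup` is a hypothetical helper, and the `findOne()` call is shown only as a comment because it needs a live connection:

```php
<?php
// Hypothetical helper: builds the filter and options for a lookup that
// returns only the first translation matching the requested language.
function buildTranslationLookup(string $originalKey, string $language): array
{
    $filter = [
        'original_key'         => $originalKey,
        'translation.language' => $language,
    ];
    $options = [
        // "translation.$" keeps only the first array element matched by the filter.
        'projection' => ['_id' => 0, 'translation.$' => 1],
    ];
    return [$filter, $options];
}

[$filter, $options] = buildTranslationLookup('What is this', 'english');

// With a MongoDB\Collection instance (assumed to exist as $collection):
// $doc = $collection->findOne($filter, $options);
// $translation = $doc === null ? false : $doc['translation'][0];

echo json_encode($filter), PHP_EOL;
echo json_encode($options), PHP_EOL;
```

Keeping the filter and options in one place makes it harder to forget the language condition that the positional projection depends on.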
I think you should use MongoDB's aggregation pipeline.
This may work. Not tested.
db.collection.aggregate([
  { $match: { original_key: "What is this" } },
  { $unwind: "$translation" },
  { $match: { "translation.language": "english" } },
  { $project: { _id: 0, language: "$translation.language", quote: "$translation.quote" } }
]);
I have a collection like this
{
"name": "Sai Darshan"
}
{
"name": "Sathya"
}
{
"name": "Richie"
}
I want to match the documents with the name "Sathya" and "Richie".
How can I achieve this using $match?
I currently tried this
$db = $this->dbMongo->selectDB("userData");
$collection = $db->selectCollection("userObject");
$aggregationFields = [
[
'$match' => [
'name'=> 'Sathya',
'name'=> 'Richie',
]
]
];
$cursor = $collection->aggregate($aggregationFields)->toArray();
Currently I am getting only the document
{
"name": "Richie"
}
I am expecting to fetch both documents i.e. the documents with the name "Sathya" and "Richie".
I expect to do this with $match itself because I have further pipelines I want to pass this data to.
Is there any way I can achieve this?
Any help is appreciated.
Thank you.
@nimrod serok answered in the comments: use the $in operator.
What is probably happening with the query in the description is that the duplicate name entry is being de-duplicated, so the query that the database receives only includes the filter for 'name'=> 'Richie'. In PHP this happens in the array literal itself, before the driver sees it: a repeated key keeps only its last value. You can see some reference to that here in the documentation, and JavaScript itself will also demonstrate this behavior:
> filter = { name: 'Sathya', name: 'Richie' };
{ name: 'Richie' }
>
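The same effect can be reproduced in PHP, followed by the `$in` form suggested above. This is a minimal sketch that only builds the pipeline array; passing it to `aggregate()` is left as in the question:

```php
<?php
// A PHP array literal keeps only the last value for a repeated key.
$broken = ['name' => 'Sathya', 'name' => 'Richie'];
var_dump(count($broken)); // int(1): only 'Richie' survives

// $in matches documents whose field equals any value in the list.
$pipeline = [
    ['$match' => ['name' => ['$in' => ['Sathya', 'Richie']]]],
    // further stages can follow here
];

echo json_encode($pipeline), PHP_EOL;
```

Because `$in` lives inside a single `$match` stage, the remaining stages of the pipeline can be appended unchanged.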
I implemented Elasticsearch using PHP for binary documents (FSCrawler). It works just fine with the default settings: I can search the documents for the word I want, and the results are case insensitive. However, I now want to support exact matches as well, i.e. on top of the current search, if the query is enclosed in quotes, I want only results that match the query exactly, including case.
My mapping looks like this:
"settings": {
"number_of_shards": 1,
"index.mapping.total_fields.limit": 2000,
"analysis": {
"analyzer": {
"fscrawler_path": {
"tokenizer": "fscrawler_path"
}
},
"tokenizer": {
"fscrawler_path": {
"type": "path_hierarchy"
}
}
}
.
.
.
"content": {
"type": "text",
"index": true
},
My query for the documents looks like this:
if ($q2 == '') {
$params = [
'index' => 'trial2',
'body' => [
'query' => [
'match_phrase' => [
'content' => $q
]
]
]
];
$query = $client->search($params);
$data['q'] = $q;
}
For exact matches (does not work):
if ($q2 == '') {
$params = [
'index' => 'trial2',
'body' => [
'query' => [
'filter' =>[
'term' => [
'content' => $q
]
]
]
]
];
$query = $client->search($params);
$data['q'] = $q;
}
The content field is the body of the document. How do I implement an exact match for a specific word or phrase in the content field?
Your content field, as I understand it, would be significantly large, as many documents may be more than 2-3 MB, and that's a lot of words.
There'd be no point in using a keyword field for the exact match, as per the answer to your earlier question where I referred to using keyword. You should use the keyword datatype for exact matches only if your data is structured.
As I understand it, your content field is unstructured. In that case you would want to use the Whitespace Analyzer on your content field.
Also, for exact phrase matches, you may take a look at the Match Phrase query.
Below are a sample index, documents, and queries that should cover your use case.
Mapping:
PUT mycontent_index
{
"mappings": {
"properties": {
"content":{
"type":"text",
"analyzer": "whitespace" <----- Note this
}
}
}
}
Sample Documents:
POST mycontent_index/_doc/1
{
"content": """
There is no pain you are receding
A distant ship smoke on the horizon
You are only coming through in waves
Your lips move but I can't hear what you're saying
"""
}
POST mycontent_index/_doc/2
{
"content": """
there is no pain you are receding
a distant ship smoke on the horizon
you are only coming through in waves
your lips move but I can't hear what you're saying
"""
}
Phrase Match (to search for a sentence with the words in order):
POST mycontent_index/_search
{
"query": {
"bool": {
"must": [
{
"match_phrase": { <---- Note this for phrase match
"content": "There is no pain"
}
}
]
}
}
}
Match Query:
POST mycontent_index/_search
{
"query": {
"bool": {
"must": [
{
"match": { <---- Use this for token based search
"content": "there"
}
}
]
}
}
}
Note that the results differ accordingly: with this mapping, the phrase query above matches only document 1, and the match query for there matches only document 2.
For an exact match on a single word, just use a simple Match query.
Note that when you do not specify an analyzer, ES uses the Standard Analyzer by default, which converts all tokens to lower case before storing them in the inverted index. The Whitespace Analyzer, however, does not lowercase tokens. As a result, There and there are stored as two different tokens in your ES index.
I'm assuming you are familiar with the Analysis and Analyzer concepts; if not, I'd suggest going through the links, as that will help you understand what I'm talking about.
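The lowercasing difference described above can be illustrated outside Elasticsearch. The two functions below are rough, hypothetical stand-ins for the analyzers, not their real implementations (the real Standard Analyzer also handles punctuation and Unicode word boundaries):

```php
<?php
// Rough stand-in for the whitespace analyzer: split on whitespace, keep case.
function whitespaceTokens(string $text): array
{
    return preg_split('/\s+/', trim($text));
}

// Rough stand-in for the lowercasing step of the standard analyzer.
function lowercasedTokens(string $text): array
{
    return array_map('strtolower', whitespaceTokens($text));
}

$line = 'There is no pain';

print_r(whitespaceTokens($line)); // "There" keeps its capital T
print_r(lowercasedTokens($line)); // "there" is stored instead
```

A search for there only finds the first token list if it was lowercased at index time, which is why the two sample documents behave differently under the whitespace analyzer.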
Updated Answer:
Now that I understand your requirements: there is no way to apply multiple analyzers to a single field, so you basically have two options:
Option 1: Use multiple indexes
Option 2: Use a multi-field in your mapping, as shown below
That way, your script or service layer holds the logic for routing to a different index or field depending on the input value (values enclosed in double quotes versus plain tokens).
Multi Field Mapping:
PUT <your_index_name>
{
"mappings":{
"properties":{
"content":{
"type":"text", <--- Field with standard analyzer
"fields":{
"whitespace":{
"type":"text", <--- Field with whitespace
"analyzer":"whitespace"
}
}
}
}
}
}
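On the PHP side, the routing logic for the multi-field option might look like the sketch below. It assumes the `content.whitespace` sub-field from the mapping above and the `trial2` index from the question; `buildContentQuery` is a hypothetical helper name:

```php
<?php
// Hypothetical helper: picks the field based on whether the user
// wrapped the query in double quotes.
function buildContentQuery(string $input, string $index): array
{
    $input = trim($input);
    $quoted = strlen($input) >= 2
        && $input[0] === '"'
        && substr($input, -1) === '"';

    // Quoted input: case-sensitive search on the whitespace-analyzed sub-field.
    // Unquoted input: default, case-insensitive search on the main field.
    $field = $quoted ? 'content.whitespace' : 'content';
    $text  = $quoted ? substr($input, 1, -1) : $input;

    return [
        'index' => $index,
        'body'  => [
            'query' => [
                'match_phrase' => [$field => $text],
            ],
        ],
    ];
}

$params = buildContentQuery('"There is no pain"', 'trial2');
// $client->search($params); // with the Elasticsearch PHP client from the question
echo json_encode($params), PHP_EOL;
```

Both branches use match_phrase so that multi-word input keeps its word order; only the target field changes.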
Ideally, I would prefer the first solution, i.e. multiple indexes with different mappings. However, I would strongly advise you to revisit your use case, because managing querying like this doesn't make much sense; but again, it's your call.
Note: A single-node cluster is the worst possible option, especially for production.
I'd suggest you ask that in a separate question detailing your document count, the expected growth rate over the next 5 years or so, and whether your use case is read heavy or write intensive. Is the cluster something other teams may also want to leverage? I'd suggest reading more and discussing with your team or manager to get more clarity on your scenarios.
Hope this helps.
I have this data coming from an API, the TMDB API, and I want to create an API that filters the data and exposes just the properties I need. In this case I want my endpoint to expose only id, title, and overview. How can I filter these properties with PHP so that only the ones I need are exposed?
"results": [{
"vote_count": 779,
"id": 420817,
"video": false,
"vote_average": 7.2,
"title": "Aladdin",
"popularity": 476.676,
"poster_path": "/3iYQTLGoy7QnjcUYRJy4YrAgGvp.jpg",
"original_language": "en",
"original_title": "Aladdin",
"genre_ids": [
12,
14,
10749,
35,
10751
],
"backdrop_path": "/v4yVTbbl8dE1UP2dWu5CLyaXOku.jpg",
"adult": false,
"overview": "A kindhearted street..."
}],
The expected result should be:
myapi.com/api
"results": [{
"id": 420817,
"title": "Aladdin",
"overview": "A kindhearted street ..."
}],
// Decode the API response into objects
$json = json_decode($data);
$results = $json->results;
// Keep only the properties to expose
$results = array_map(function ($r) {
    return ["id" => $r->id, "title" => $r->title, "overview" => $r->overview];
}, $results);
echo json_encode(["results" => $results]);
You can filter the array like this:
function myFilter($result) {
$filtered_array = array();
foreach($result as $movie) {
$filtered_movie = array();
$filtered_movie['id'] = $movie['id'];
$filtered_movie['title'] = $movie['title'];
$filtered_movie['overview'] = $movie['overview'];
array_push($filtered_array, $filtered_movie);
}
return $filtered_array;
}
Just pass the unfiltered array to the function and you'll get the filtered one back.
EDIT:
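If your data is in JSON format, you will need to use json_decode($data, true) to turn it into an associative array in PHP (without the second argument you get objects, and the array access in the function above will fail). If you want to return it as JSON, use json_encode() for that.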
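A more compact variant of the same filtering uses a whitelist of keys with `array_intersect_key()`. This is a minimal sketch; the `$data` string is a shortened, made-up stand-in for the real API response:

```php
<?php
// Shortened sample payload shaped like the API response in the question.
$data = '{"results":[{"vote_count":779,"id":420817,"video":false,'
      . '"title":"Aladdin","adult":false,"overview":"A kindhearted street..."}]}';

// Keys to expose; everything else is dropped.
$allowed = array_flip(['id', 'title', 'overview']);

// Passing true makes json_decode() return associative arrays.
$decoded = json_decode($data, true);

$filtered = array_map(
    fn(array $movie): array => array_intersect_key($movie, $allowed),
    $decoded['results']
);

echo json_encode(['results' => $filtered]), PHP_EOL;
```

Adding or removing an exposed property then only means editing the whitelist.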
I am trying to write a query to search for products on two columns called category1 and category2. I am using the Elasticsearch PHP client and tried a match query inside should, but this gives me wrong results because substrings also match.
I am looking for an exact match with an OR operation on the two columns. I am new to this, so please guide me.
$params['index'] = 'furnit';
$params['type'] = 'products';
$params['body']['query']['bool']['should'] = array(
array('match' => array('category1' => $category->name)),
array('match' => array('category2' => $category->name)),
);
$results = $this->elasticsearch->search($params);
If you are not doing a relevance search, then a bool query is not the right way to do this in Elasticsearch. Queries are meant for searching, where the relevancy of your search keyword and the score of matching documents matter.
Here you can apply an Elasticsearch bool filter to filter out the desired results. Using filters together with queries (a filtered query) is the right way to do it, as the filter first excludes all non-matching documents, and you can then search within the remaining ones using match queries.
Here's an example of a bool filter:
{
"from": 0,
"size": 50,
"sort": [
{
"name" : {
"order": "asc"
}
}
],
"query": {
"filtered": {
"query": {
"match_all" : {}
},
"filter": {
"bool": {
"should": [
{
"term": {
"category1" : "category1"
}
},
{
"term": {
"category2" : "category2"
}
}
]
}
}
}
}
}
You can refer to the docs as well (https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-bool-filter.html).
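Note that the filtered query shown above was removed in Elasticsearch 5.0; on later versions the same idea is written as a bool query with a filter clause. A minimal PHP sketch of that form, assuming category1 and category2 are indexed as exact values (not analyzed, or keyword on newer versions); `buildCategoryFilter` is a hypothetical helper, 'Sofas' is a made-up sample value, and the `type` parameter is left out on the assumption of a recent cluster:

```php
<?php
// Hypothetical helper: exact match on either category field (OR),
// expressed as a non-scoring filter.
function buildCategoryFilter(string $index, string $categoryName): array
{
    return [
        'index' => $index,
        'body'  => [
            'query' => [
                'bool' => [
                    'filter' => [
                        'bool' => [
                            'should' => [
                                ['term' => ['category1' => $categoryName]],
                                ['term' => ['category2' => $categoryName]],
                            ],
                            // At least one of the two terms must match.
                            'minimum_should_match' => 1,
                        ],
                    ],
                ],
            ],
        ],
    ];
}

$params = buildCategoryFilter('furnit', 'Sofas');
// $results = $this->elasticsearch->search($params); // client from the question
echo json_encode($params, JSON_PRETTY_PRINT), PHP_EOL;
```

Because term queries are not analyzed, the value passed in must match the indexed value exactly, including case.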
Maybe your problem is that you are using the default analyzer (the standard analyzer).
Could you share your mapping?
I suggest switching to not_analyzed when indexing and using a term filter/query.
You can use the Put Mapping API to set this up: Put Mapping
Edit: I have created a gist for you, check it here:
Mappings & Terms Filter
Hi, I'm a real MongoDB newbie.
I have a document like this:
{
"_id": ObjectId("53182e32e4b0feedb1dea751"),
"solutions": [
[
{
"solution": "Double Room Economy (Without Breakfast)",
"board": "Room Only",
"id": "HK-15501871",
"price": 5000,
"available": "1",
"CXL": "[]",
"unique": 0
},
{
"solution": "Double Room Economy (With Breakfast)",
"board": "Room Only",
"id": "HK-15501871",
"price": 4600,
"available": "1",
"CXL": "[]",
"unique": 1
},
{
"solution": "Double Room Economy (Room Only)",
"board": "Room Only",
"id": "HK-15501871",
"price": 5500,
"available": "1",
"CXL": "[]",
"unique": 2
}
]
]
}
I need to update the CXL field inside the second element of solutions,
i.e. solutions.1.CXL
This is how I fetch the document:
$collection = $this->getCollection();
$query = array("_id"=>new MongoId($id));
$document = $collection->findOne($query);
Now I need to update that field without touching the others.
How can I do that?
Thanks!
SOLVED THANKS TO @Sammaye
I solved it this way:
$collection->update(
array('_id' => new MongoId('..')),
array('$set' => array('solutions.0.1.CXL' => 'something'))
);
Edit
To actually update by the first index then you can do:
$db->collection->update(
['_id' => new \MongoId($id)],
['$set' => ['solutions.0.1.CXL' => 'whatever']]
);
I misread the question when posting the information below:
So you want to update all CXL fields in the document (since you are only searching by the top-level document _id)?
That isn't possible without manually pulling the document out, iterating over the subdocuments in the solutions field, and then re-saving it.
This is because there is currently no way of saying, "Update all that match".
This, however, is most likely the JIRA you would want to look for: https://jira.mongodb.org/browse/SERVER-1243
As long as you know you are going to update the second element, use the index of the array to do so. More on that below. First, you need the $set operator so that you only set the field value instead of overwriting the whole document:
db.collection.update(
{ _id: ObjectId("53182e32e4b0feedb1dea751") },
{ $set: { "solutions.0.1.CXL": [ 1, 2, 3 ] } }
)
If you just want to add to the array rather than replace the whole thing, then just use $push instead:
db.collection.update(
{ _id: ObjectId("53182e32e4b0feedb1dea751") },
{ $push: { "solutions.0.1.CXL": 4 } }
)
If you are paying attention to the notation, then you will notice that the array index values are present in the field to be updated. There is a very good reason for this, which can be read on the documentation for the positional $ operator.
The issue is that you have a nested array, which as the documentation refers to, causes a problem if you try to match items within that nested array. That problem is, if you try to use the "positional" operator to find the matched index of something you look for in a query, then it will contain the value of the first array index match that it finds.
In this case that would be your "top level" array and the "found" index is 0 and not 1 as you may expect.
Please be aware of this issue if you intend to use nested arrays.
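Since the positional operator cannot reach the inner array here, the explicit-index path has to be built by hand. A minimal PHP sketch of that, where `dotPath` is a hypothetical helper and the update call is shown only as a comment because it needs a live connection:

```php
<?php
// Hypothetical helper: joins field names and array indexes into the
// dot-notation path MongoDB expects, e.g. "solutions.0.1.CXL".
function dotPath(...$segments): string
{
    return implode('.', array_map('strval', $segments));
}

$outerIndex = 0; // the only element of the top-level "solutions" array
$innerIndex = 1; // the second solution inside it

$update = [
    '$set' => [dotPath('solutions', $outerIndex, $innerIndex, 'CXL') => 'something'],
];

// Legacy driver, as in the question:
// $collection->update(['_id' => new MongoId($id)], $update);

echo json_encode($update), PHP_EOL;
```

Keeping both indexes as named variables makes it explicit which level of the nested array each one addresses.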
You can update like this:
db.collection.update(
  { _id: ObjectId("53182e32e4b0feedb1dea751") },
  // explicit indexes are needed: the positional $ cannot reach the inner array
  { $set: { "solutions.0.1.CXL": "Whatever!" } },
  false, // upsert
  true   // multi
);
You may want to go through this once
http://docs.mongodb.org/manual/reference/method/db.collection.update/