Elasticsearch. Nested query for nested in nested

My mapping is (part of it):
$index = [
"mappings" => [
"goods" => [
"dynamic_templates"=> [
[
"iattribute_id"=> [
"match_mapping_type"=> "string",
"match"=> "attribute_id",
"mapping"=> [
"type"=> "integer"
]
]
],
[
"iattribute_value"=> [
"match_mapping_type"=> "string",
"match"=> "attribute_value",
"mapping"=> [
"type"=> "string",
"index" => "not_analyzed"
]
]
]
],
"properties" => [
...
"individual_attributes" => [
"type" => "nested",
"properties" => [
"template_id" => ["type" => "integer"],
"attributes_set" => [
"type" => "nested",
"properties" => [
"attribute_id" => ["type" => "integer"],
"attribute_value" => ["type" => "string", "index" => "not_analyzed"]
]
]
]
]
...
]
]
]
];
How can I query attribute_id and attribute_value? They sit in a nested object inside another nested object, and I can't work out how to specify the path to the fields.
I've composed the query below, but it doesn't work.
GET /index/type/_search
{
"query" : {
"nested" : {
"path" : "individual_attributes.attributes_set",
"score_mode" : "none",
"filter": {
"bool": {
"must": [
{
"term" : {
"individual_attributes.attributes_set.attribute_id": "20"
}
},
{
"term" : {
"individual_attributes.attributes_set.attribute_value": "commodi"
}
}
]
}
}
}
}
}

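Each nested level needs its own nested clause. Wrap a nested query on individual_attributes.attributes_set inside a nested query on individual_attributes: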
{
"query": {
"nested": {
"path": "individual_attributes",
"score_mode": "none",
"filter": {
"nested": {
"path": "individual_attributes.attributes_set",
"query": {
"bool": {
"must": [
{
"term": {
"individual_attributes.attributes_set.attribute_id": "20"
}
},
{
"term": {
"individual_attributes.attributes_set.attribute_value": "commodi"
}
}
]
}
}
}
}
}
}
}
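For use from PHP, the same nested-in-nested structure can be sketched as an array builder. This is only an illustration: buildAttributeQuery is a hypothetical helper name, and the sketch assumes an Elasticsearch version whose nested query accepts a "query" key at both levels (the "filter" key used above is, as far as I know, only accepted by older releases).

```php
<?php
// Hypothetical helper, not part of the original answer: builds a
// nested-in-nested query for one attribute id/value pair.
function buildAttributeQuery(int $attributeId, string $attributeValue): array
{
    $outer = 'individual_attributes';
    $inner = $outer . '.attributes_set';

    return [
        'query' => [
            'nested' => [
                'path' => $outer,          // first nested level
                'score_mode' => 'none',
                'query' => [
                    'nested' => [
                        'path' => $inner,  // second nested level, full path
                        'query' => [
                            'bool' => [
                                'must' => [
                                    ['term' => ["$inner.attribute_id" => $attributeId]],
                                    ['term' => ["$inner.attribute_value" => $attributeValue]],
                                ],
                            ],
                        ],
                    ],
                ],
            ],
        ],
    ];
}

// Print the request body so it can be compared with the JSON above.
echo json_encode(buildAttributeQuery(20, 'commodi'), JSON_PRETTY_PRINT), "\n";
```

The resulting array can be passed as the 'body' of a search request. Field names inside the term clauses always use the full dotted path, no matter how deep the nesting is.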

Related

Giving priority to prefix match in elasticsearch in php

Is there a way in Elasticsearch to give a prefix match higher priority than a string that merely contains the word?
For example, if I search for ram, the results should be ordered like this:
Ram Reddy
Joy Ram Das
Kiran Ram Goel
Swati Ram Goel
Ramesh Singh
I have tried the mapping given here.
I created the index like this:
$params = [
"index" => $myIndex,
"body" => [
"settings"=> [
"analysis"=> [
"analyzer"=> [
"start_with_analyzer"=> [
"tokenizer"=> "my_edge_ngram",
"filter"=> [
"lowercase"
]
]
],
"tokenizer"=> [
"my_edge_ngram"=> [
"type"=> "edge_ngram",
"min_gram"=> 3,
"max_gram"=> 15
]
]
]
],
"mappings"=> [
"doc"=> [
"properties"=> [
"label"=> [
"type"=> "text",
"fields"=> [
"keyword"=> [
"type"=> "keyword"
],
"ngramed"=> [
"type"=> "text",
"analyzer"=> "start_with_analyzer"
]
]
]
]
]
]
]
];
$response = $client->indices()->create($params); // create an index
and searching like this:
$body = [
"size" => 100,
'_source' => $select,
"query"=> [
"bool"=> [
"should"=> [
[
"query_string"=> [
"query"=> "ram*",
"fields"=> [
"value"
],
"boost"=> 5
]
],
[
"query_string"=> [
"query"=> "ram*",
"fields"=> [
"value.ngramed"
],
"analyzer"=> "start_with_analyzer",
"boost"=> 2
]
]
],
"minimum_should_match"=> 1
]
]
];
$params = [
'index' => $myIndex,
'type' => $myType,
'body' => []
];
$params['body'] = $body;
$response = $client->search($params);
The JSON of the query is as follows:
{
"size": 100,
"_source": [
"label",
"value",
"type",
"sr"
],
"query": {
"bool": {
"should": [
{
"query_string": {
"query": "ram*",
"fields": [
"value"
],
"boost": 5
}
},
{
"query_string": {
"query": "ram*",
"fields": [
"value.ngramed"
],
"analyzer": "start_with_analyzer",
"boost": 2
}
}
],
"minimum_should_match": 1,
"must_not": {
"match_phrase": {
"type": "propertyValue"
}
}
}
}
}
I am using Elasticsearch 5.3.2.
Is there any other way to sort the search results, as one would in a relational database, using the search method in PHP?

You should not enable fielddata unless it is really required. To avoid it, you can sort on a keyword sub-field instead.
Make the following changes to your code:
"label"=>[
"type"=>"text",
//"fielddata"=> true, ---->remove/comment this line
"analyzer"=>"whitespace",
"fields"=>[
"keyword"=>[
"type"=>"keyword"
]
]
]
To sort on the type field, use type.keyword instead. The same change applies to any field of type text that has a sub-field of type keyword (assuming the sub-field is named keyword). So change the sort as below:
'sort' => [
["type.keyword"=>["order"=>"asc"]],
["sr"=>["order"=>"asc"]],
["propLabels"=>["order"=>"asc"]],
["value"=>["order"=>"asc"]]
]
Update: index creation and query to get the desired output
Create the index as below:
{
"settings": {
"analysis": {
"analyzer": {
"start_with_analyzer": {
"tokenizer": "my_edge_ngram",
"filter": [
"lowercase"
]
}
},
"tokenizer": {
"my_edge_ngram": {
"type": "edge_ngram",
"min_gram": 3,
"max_gram": 15
}
}
}
},
"mappings": {
"_doc": {
"properties": {
"name": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword"
},
"ngramed": {
"type": "text",
"analyzer": "start_with_analyzer"
}
}
}
}
}
}
}
Use the query below to get the desired result:
{
"query": {
"bool": {
"should": [
{
"query_string": {
"query": "Ram",
"fields": [
"name"
],
"boost": 5
}
},
{
"query_string": {
"query": "Ram",
"fields": [
"name.ngramed"
],
"analyzer": "start_with_analyzer",
"boost": 2
}
}
],
"minimum_should_match": 1
}
}
}
In the query above, the clause with boost 5 raises the score of documents where Ram is present as a word in name. The clause with boost 2 further raises the score of documents whose name starts with Ram.
Sample output:
"hits": [
{
"_index": "test",
"_type": "_doc",
"_id": "2",
"_score": 2.0137746,
"_source": {
"name": "Ram Reddy"
}
},
{
"_index": "test",
"_type": "_doc",
"_id": "1",
"_score": 1.4384104,
"_source": {
"name": "Joy Ram Das"
}
},
{
"_index": "test",
"_type": "_doc",
"_id": "3",
"_score": 0.5753642,
"_source": {
"name": "Ramesh Singh"
}
}
]
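Since the question uses the PHP client, the same two-clause query can be sketched as a PHP array. buildPrefixBoostQuery is a hypothetical helper name; the index name test is taken from the sample output, and the field is assumed to have an ngramed sub-field analyzed with start_with_analyzer, as in the index definition above.

```php
<?php
// Hypothetical helper: builds the boosted "word match + prefix match" query.
function buildPrefixBoostQuery(string $term, string $field = 'name'): array
{
    return [
        'query' => [
            'bool' => [
                'should' => [
                    // Matches documents that contain the term as a word.
                    ['query_string' => [
                        'query' => $term,
                        'fields' => [$field],
                        'boost' => 5,
                    ]],
                    // Adds extra score when the field value starts with the term.
                    ['query_string' => [
                        'query' => $term,
                        'fields' => ["$field.ngramed"],
                        'analyzer' => 'start_with_analyzer',
                        'boost' => 2,
                    ]],
                ],
                'minimum_should_match' => 1,
            ],
        ],
    ];
}

$params = [
    'index' => 'test',
    'body' => buildPrefixBoostQuery('Ram'),
];
// With a client created as in the question:
// $response = $client->search($params);
echo json_encode($params['body'], JSON_PRETTY_PRINT), "\n";
```

Because both clauses sit under should, a document that satisfies both gets the sum of the two scores, which is why "Ram Reddy" ranks above "Joy Ram Das" in the sample output.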

No results once implementing an analyzer in Elasticsearch

I need to ignore apostrophes in indexed content so that searching for "Johns potato" will show results for "John's potato".
I was able to get the analyzer accepted, but now I get no search results. Does anyone see something obvious that I am missing?
$params = [
'index' => $index,
'body' => [
'settings' => [
'number_of_shards' => 5,
'number_of_replicas' => 2,
'analysis' => [
"analyzer" => [
"my_analyzer" => [
"tokenizer" => "keyword",
"char_filter" => [
"my_char_filter"
]
]
],
"char_filter" => [
"my_char_filter" => [
"type" => "mapping",
"mappings" => [
"' => "
]
]
]
]
],
'mappings' => [
$type => [
'_source' => [
'enabled' => true
],
'properties' => [
'title' => [
'type' => 'text',
'analyzer' => 'my_analyzer'
],
'content' => [
'type' => 'text',
'analyzer' => 'my_analyzer'
]
]
]
]
]
];
I did find out that removing the analyzer from my field mappings allowed results to reappear, but I get no results the second I add the analyzer.
Here's an example query that I make.
{
"body": {
"query": {
"bool": {
"must": {
"multi_match": {
"query": "apples",
"fields": [
"title",
"content"
]
}
},
"filter": {
"terms": {
"site_id": [
"1351",
"1349"
]
}
},
"must_not": [
{
"match": {
"visible": "false"
}
},
{
"match": {
"locked": "true"
}
}
]
}
}
}
}

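Your custom analyzer uses the keyword tokenizer, which emits the entire field value as a single token, so a query such as "apples" matches only documents whose whole title or content is exactly that text. What you probably want instead is the built-in english analyzer. The standard analyzer, which is the default, tokenizes on whitespace and some punctuation but leaves apostrophes alone. The english analyzer knows the language, so it can also stem words and remove stop words.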
Here is the standard analyzer's output, where you can see "john's":
POST _analyze
{
"analyzer": "standard",
"text": "John's potato"
}
{
"tokens": [
{
"token": "john's",
"start_offset": 0,
"end_offset": 6,
"type": "<ALPHANUM>",
"position": 0
},
{
"token": "potato",
"start_offset": 7,
"end_offset": 13,
"type": "<ALPHANUM>",
"position": 1
}
]
}
And here is the english analyzer's output, where you can see that the 's is removed. Stemming allows "John's", "Johns", and "John" to all match the document.
POST _analyze
{
"analyzer": "english",
"text": "John's potato"
}
{
"tokens": [
{
"token": "john",
"start_offset": 0,
"end_offset": 6,
"type": "<ALPHANUM>",
"position": 0
},
{
"token": "potato",
"start_offset": 7,
"end_offset": 13,
"type": "<ALPHANUM>",
"position": 1
}
]
}
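Applied to the index creation code from the question, the change might look like the sketch below. buildEnglishIndexParams is a hypothetical helper name, and the sketch assumes typed mappings as used in the question; only the analyzer name on the two text fields differs from the original mapping.

```php
<?php
// Hypothetical helper: index creation parameters that use the built-in
// english analyzer for the two text fields from the question.
function buildEnglishIndexParams(string $index, string $type): array
{
    return [
        'index' => $index,
        'body' => [
            'mappings' => [
                $type => [
                    'properties' => [
                        'title' => ['type' => 'text', 'analyzer' => 'english'],
                        'content' => ['type' => 'text', 'analyzer' => 'english'],
                    ],
                ],
            ],
        ],
    ];
}

// 'my_index' and 'my_type' are placeholder names.
$params = buildEnglishIndexParams('my_index', 'my_type');
// With a client created as in the question:
// $client->indices()->create($params);
echo json_encode($params['body'], JSON_PRETTY_PRINT), "\n";
```

No custom analysis settings are needed here because english ships with Elasticsearch. Existing documents have to be reindexed for the new analyzer to take effect.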

ElasticSearch query with diacritics / accents in PHP

I have the following expression: "noapte bună", and I want to get the same result whether I search for "bună" or "buna".
I followed the tutorial here: https://www.elastic.co/guide/en/elasticsearch/guide/current/asciifolding-token-filter.html but it did not work.
This is my code:
$params = ['index' => 'asciiv3', 'body' => [
"settings" => [
"analysis" => [
"analyzer" => [
"folding" => [
"tokenizer" => "standard",
"filter" => [ "lowercase", "asciifolding" ]
]
]
]
],
"mappings" => [
"asciiv3" => [
"properties" => [
"saying" => [
"type" => "string",
"analyzer" => "standard",
"fields" => [
"folded" => [
"type" => "string",
"analyzer" => "folding"
]
]
]
]
]
]
]];
self::$instance->indices()->create($params);
and this is the query array:
'multi_match' =>
array(
"type" => "most_fields",
"query" => "bună",
"fields" => [ "saying", "saying.folded" ]
)
Does anyone know what I'm doing wrong?

It works for me. This is my setup:
PUT asciiv3
{
"settings": {
"analysis": {
"analyzer": {
"folding": {
"tokenizer": "standard",
"filter": [
"lowercase",
"asciifolding"
]
}
}
}
},
"mappings": {
"asciiv3": {
"properties": {
"saying": {
"type": "string",
"analyzer": "standard",
"fields": {
"folded": {
"type": "string",
"analyzer": "folding"
}
}
}
}
}
}
}
POST /asciiv3/asciiv3/1
{
"saying":"bună ziua"
}
POST /asciiv3/asciiv3/2
{
"saying":"buna ziua"
}
GET /asciiv3/_search
{
"query": {
"multi_match": {
"type": "most_fields",
"query": "bună",
"fields": [
"saying",
"saying.folded"
]
}
}
}
With these results:
"hits": {
"total": 2,
"max_score": 0.2712221,
"hits": [
{
"_index": "asciiv3",
"_type": "asciiv3",
"_id": "1",
"_score": 0.2712221,
"_source": {
"saying": "bună ziua"
}
},
{
"_index": "asciiv3",
"_type": "asciiv3",
"_id": "2",
"_score": 0.028130025,
"_source": {
"saying": "buna ziua"
}
}
]
}
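Both documents match because the saying.folded sub-field stores tokens with the diacritics removed, while saying keeps them, so the exact match scores higher. The toy function below only illustrates that folding idea in plain PHP; it is not how Elasticsearch implements asciifolding, and foldToAscii with its small Romanian character map is a hypothetical example.

```php
<?php
// Toy illustration of what the "folding" analyzer does to a token:
// lowercase it and replace accented letters with ASCII equivalents.
// The map covers only a few Romanian letters and is an assumption
// made for this example.
function foldToAscii(string $text): string
{
    $map = [
        'ă' => 'a', 'â' => 'a', 'î' => 'i', 'ș' => 's', 'ț' => 't',
        'Ă' => 'a', 'Â' => 'a', 'Î' => 'i', 'Ș' => 's', 'Ț' => 't',
    ];
    // strtr with an array replaces whole byte sequences, so it is safe
    // for UTF-8 input; strtolower then handles the remaining ASCII letters.
    return strtolower(strtr($text, $map));
}

// Both spellings produce the same folded token, so both can match.
var_dump(foldToAscii('bună') === foldToAscii('buna')); // bool(true)
```

With most_fields, the score from saying and the score from saying.folded are combined, which is why "bună ziua" ranks above "buna ziua" for the query "bună".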

Change the JSON format

I am working with Drupal 8 and trying to get the JSON of all nodes of a content type. I get the JSON shown below, and I want to change it from this:
[
{
"nid": [
{
"value": "17"
}
],
"uuid": [
{
"value": "3614e0c8-88d4-4e8d-a732-5089698556d5"
}
],
"vid": [
{
"value": "17"
}
],
"type": [
{
"target_id": "resume_creator"
}
],
"langcode": [
{
"value": "en"
}
],
"title": [
{
"value": "uyi"
}
],
"uid": [
{
"target_id": "1"
}
],
"status": [
{
"value": "1"
}
],
"created": [
{
"value": "1452060690"
}
],
"changed": [
{
"value": "1452060709"
}
],
"promote": [
{
"value": "1"
}
],
"sticky": [
{
"value": "0"
}
],
"revision_timestamp": [
{
"value": "1452060709"
}
],
"revision_uid": [
{
"target_id": "1"
}
],
"revision_log": [],
"revision_translation_affected": [
{
"value": "1"
}
],
"default_langcode": [
{
"value": "1"
}
],
"path": [],
"field_communication_address": [
{
"value": "rtyrtytr\r\nuu;\r\nsdgfdh"
}
],
"field_education": [
{
"value": "ytutyuii"
}
],
"field_emails": [
{
"value": "gtf#fgfg.com"
}
],
"field_experiece": [
{
"value": "fghtutyu"
}
],
"field_name": [
{
"value": "ytt"
}
]
}
]
to this format:
[
{
"nid":"17",
"uuid":"3614e0c8-88d4-4e8d-a732-5089698556d5",
"vid": "17",
"type":"resume_creator",
"langcode":"en",
"title":"uyi",
"uid":"1",
"status":"1",
"created":"1452060690",
"changed":"1452060709",
"promote":"1",
"sticky":"0",
"revision_timestamp":"1452060709",
"revision_uid":"1",
"revision_log": [],
"path":[],
"field_communication_address":"rtyrtytr\r\nuu;\r\nsdgfdh",
"field_education":"ytutyuii",
"field_emails":"gtf#fgfg.com",
"field_experiece":"fghtutyu",
"field_name":"ytt"
}
]
using PHP. Only then can I handle it in an AngularJS form. Thanks in advance.

Try this
$json = '{
"nid": [
{
"value": "17"
}
],
"uuid": [
{
"value": "3614e0c8-88d4-4e8d-a732-5089698556d5"
}
],
"vid": [
{
"value": "17"
}
],
"type": [
{
"target_id": "resume_creator"
}
],
"langcode": [
{
"value": "en"
}
],
"title": [
{
"value": "uyi"
}
],
"uid": [
{
"target_id": "1"
}
],
"status": [
{
"value": "1"
}
],
"created": [
{
"value": "1452060690"
}
],
"changed": [
{
"value": "1452060709"
}
],
"promote": [
{
"value": "1"
}
],
"sticky": [
{
"value": "0"
}
],
"revision_timestamp": [
{
"value": "1452060709"
}
],
"revision_uid": [
{
"target_id": "1"
}
],
"revision_log": [],
"revision_translation_affected": [
{
"value": "1"
}
],
"default_langcode": [
{
"value": "1"
}
],
"path": [],
"field_communication_address": [
{
"value": "rtyrtytr\r\nuu;\r\nsdgfdh"
}
],
"field_education": [
{
"value": "ytutyuii"
}
],
"field_emails": [
{
"value": "gtf#fgfg.com"
}
],
"field_experiece": [
{
"value": "fghtutyu"
}
],
"field_name": [
{
"value": "ytt"
}
]
}';
// Decode into an associative array.
$json = json_decode($json, true);
foreach ($json as $key => $value) {
    // Replace each single-item field list with its scalar value.
    if (isset($value[0]['value'])) {
        $json[$key] = $value[0]['value'];
    } elseif (isset($value[0]['target_id'])) {
        $json[$key] = $value[0]['target_id'];
    }
    // Fields such as "path": [] are left unchanged.
}
$json = json_encode($json);
print_r($json);
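The JSON in the question is a list of nodes rather than a single node, so the same idea can be wrapped in a function and applied to each element. This is a sketch under that assumption; flattenNode and flattenNodes are hypothetical names, and only the first item of each field is kept.

```php
<?php
// Flatten one node: each field holding [{"value": x}] or [{"target_id": x}]
// becomes just x. Anything else (e.g. empty lists) is copied unchanged.
function flattenNode(array $node): array
{
    $flat = [];
    foreach ($node as $field => $items) {
        $first = is_array($items) ? ($items[0] ?? null) : null;
        if (is_array($first) && array_key_exists('value', $first)) {
            $flat[$field] = $first['value'];
        } elseif (is_array($first) && array_key_exists('target_id', $first)) {
            $flat[$field] = $first['target_id'];
        } else {
            $flat[$field] = $items;
        }
    }
    return $flat;
}

// Flatten a JSON document that contains a list of nodes.
function flattenNodes(string $json): string
{
    $nodes = json_decode($json, true);
    if (!is_array($nodes)) {
        throw new InvalidArgumentException('Input is not valid JSON');
    }
    return json_encode(array_map('flattenNode', $nodes));
}

$input = '[{"nid":[{"value":"17"}],"type":[{"target_id":"resume_creator"}],"path":[]}]';
echo flattenNodes($input), "\n";
// prints [{"nid":"17","type":"resume_creator","path":[]}]
```

Multi-value fields would lose everything after their first item with this approach, so it only fits content types where each field holds a single value.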

It is simple.
<?php
$arr = array('nid' => 17, 'uuid' => '3614e0c8-88d4-4e8d-a732-5089698556d5', ...);
echo json_encode($arr);
?>
If anything is unclear, ask me.

Parse Huge GeoJSON file and get polygon coordinates of a specific property

How can I get the polygon coordinates for a specific property? The file is huge, so parsing time is a factor.
Is there a library to do that?
Sample of the GeoJSON:
{
"type": "FeatureCollection",
"crs": { "type": "name", "properties": { "name": "urn:ogc:def:crs:EPSG::37001" } },
"features": [
{ "type": "Feature", "properties": { "HOOD_ID": 2799.000000, "HOOD_NAME": "Overtown", "MARKET_ID": "MK1245000", "MARKET": "Miami", "STATE": "12", "STATENAME": "Florida", "LATITUDE": 25.784659, "LONGITUDE": -80.202625, "AREA": 1.495920, "HLEVEL": 2.000000, "DATE_ADDED": "2012\/08\/04", "FLAG1": 0, "OB_GEO_ID": "NH2799" }, "geometry": { "type": "Polygon", "coordinates": [ [ [ -80.21463341110001, 25.782154451300002 ], [ -80.21588353300001, 25.782696872700001 ], [ -80.217973576800006, 25.7833078056 ], [ -80.219539583200003, 25.784199528800002 ], [ -80.211531118000011, 25.787386122500003 ], [ -80.20836940560001, 25.789128957700001 ], [ -80.206422272200001, 25.789848709300003 ], [ -80.2060101207, 25.7907922853 ], [ -80.206013661300005, 25.793566844899999 ], [ -80.206013794, 25.7968569831 ], [ -80.202368489099996, 25.796952708299997 ], [ -80.202379, 25.797313 ], [ -80.199836, 25.797309 ], [ -80.199819759600004, 25.7970196375 ], [ -80.1993398571, 25.797032239699998 ], [ -80.193583490500004, 25.797234161599999 ], [ -80.193806159800005, 25.796203267299997 ], [ -80.194272724399994, 25.7951752727 ], [ -80.193944, 25.795182 ], [ -80.194266, 25.793434 ], [ -80.195336, 25.789592 ], [ -80.195534, 25.787847 ], [ -80.195514, 25.778409 ], [ -80.195969425200005, 25.778397321299998 ], [ -80.19557104899999, 25.773179598799999 ], [ -80.195360063199999, 25.768486166300001 ], [ -80.196768768399991, 25.7682545324 ], [ -80.198226099099998, 25.768721241800002 ], [ -80.199164023899996, 25.769800189500003 ], [ -80.199997701599997, 25.770738292499999 ], [ -80.200414826200003, 25.772286616100001 ], [ -80.200936435800003, 25.773272690900001 ], [ -80.202343232900006, 25.7749143389 ], [ -80.204375245, 25.776884093299998 ], [ -80.205990323199998, 25.777259031 ], [ -80.206835373600001, 25.777897973199998 ], [ -80.207587, 25.777601 ], [ -80.210881, 25.78 ], [ -80.21463341110001, 25.782154451300002 ] ] ] } },
{ "type": "Feature", "properties": { "HOOD_ID": 2169.000000, "HOOD_NAME": "Church District", "MARKET_ID": "MK1235000", "MARKET": "Jacksonville", "STATE": "12", "STATENAME": "Florida", "LATITUDE": 30.332174, "LONGITUDE": -81.660212, "AREA": 0.131745, "HLEVEL": 1.000000, "DATE_ADDED": "2012\/08\/04", "FLAG1": 0, "OB_GEO_ID": "NH2169" }, "geometry": { "type": "Polygon", "coordinates": [ [ [ -81.664799, 30.331204 ], [ -81.663868, 30.334826 ], [ -81.655617, 30.333239 ], [ -81.656717, 30.329439 ], [ -81.664799, 30.331204 ] ] ] } }
]
}

Large files are best parsed with an event-based JSON parser (here I use one by kuma-giyomu). The idea is to register callbacks that fire when a certain token is encountered, so the data can be processed while parsing is still in progress.
In the following code, encountering the property "coordinates" triggers the creation of a new Polygon object. The array handler's start function then begins a new coordinate array, which is added to the Polygon object when the array's end token is encountered.
<?php
include "JSONParser.php";
class Polygon {
public $coordinates = array();
}
$coords = null;
$polygons = array();
$polygon = null;
$j = new JSONParser();
$j->setPropertyHandler(function($value, $property) {
global $polygons, $polygon;
if ($value != "coordinates") {
if (!is_null($polygon)) {
$polygons[] = $polygon;
$polygon = null;
}
return;
}
if (is_null($polygon)) {
$polygon = new Polygon;
}
});
$j->setArrayHandlers(function($value, $property) {
global $coords, $polygon;
if (!is_null($polygon)) {
$coords = array();
}
}, function($value, $property) {
global $coords, $polygon;
if (!is_null($coords)) {
if (!is_null($polygon)) {
$polygon->coordinates[] = $coords;
}
$coords = null;
}
});
$j->setScalarHandler(function($value, $property) {
global $coords;
if (!is_null($coords)) {
$coords[] = $value;
}
});
try {
$j->parseDocument("test.json");
} catch (JSONParserException $e) {
}
if (!is_null($polygon)) {
$polygons[] = $polygon;
$polygon = null;
}
print_r($polygons);
outputs
Array
(
[0] => Polygon Object
(
[coordinates] => Array
(
[0] => Array
(
[0] => -80.21463341110001
[1] => 25.782154451300002
)
[1] => Array
(
[0] => -80.21588353300001
[1] => 25.782696872700001
)
[...]
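The code above collects every polygon in the file. If only one feature is needed and the file is laid out exactly like the sample in the question, with one complete Feature object per line, a line-by-line scan may be enough. That layout is an assumption, so this is a sketch rather than a general GeoJSON parser; findPolygonCoordinates is a hypothetical name.

```php
<?php
// Assumption: every Feature occupies exactly one line of the file,
// as in the sample. Returns the coordinates of the first feature whose
// property matches, or null if none is found.
function findPolygonCoordinates(string $path, string $property, $wanted): ?array
{
    $handle = fopen($path, 'r');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path");
    }
    try {
        while (($line = fgets($handle)) !== false) {
            // Drop surrounding whitespace and the comma between features.
            $line = rtrim(trim($line), ',');
            // Skip header and footer lines that are not Feature objects.
            if (strpos($line, '"Feature"') === false) {
                continue;
            }
            $feature = json_decode($line, true);
            if (!is_array($feature)) {
                continue;
            }
            // Loose comparison so that 2799 matches the decoded float 2799.0.
            if (($feature['properties'][$property] ?? null) == $wanted) {
                return $feature['geometry']['coordinates'] ?? null;
            }
        }
    } finally {
        fclose($handle);
    }
    return null;
}

// Example call (file name as in the answer above):
// $coords = findPolygonCoordinates('test.json', 'HOOD_NAME', 'Overtown');
```

Only one line is held in memory at a time, and the scan stops at the first match. For files that are not formatted this way, an event-based parser remains the safer choice.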
