Elasticsearch sorting only by alphabetical not by numeric - php

I am having a problem with sorting in PHP. Here is my mapping:
{
"jdbc": {
"mappings": {
"jdbc": {
"properties": {
"admitted_date": {
"type": "date",
"format": "dateOptionalTime"
},
"et_tax": {
"type": "string"
},
"jt_tax": {
"type": "string"
},
"loc_cityname": {
"type": "string"
},
"location_countryname": {
"type": "string"
},
"location_primary": {
"type": "string"
},
"pd_firstName": {
"type": "string"
}
}
}
}
}
}
When I sort the results, the order looks wrong: it is not alphabetical by the first letter of the name, which is what I need. Currently the results are ordered like this:
http://localhost:9200/jdbc/_search?pretty=true&sort=pd_lawFirmName:asc
BM&A
Gomez-Acebo & Pombo
Addleshaw Goddard
How can I order the results like this?
Addleshaw Goddard
BM&A
Gomez-Acebo & Pombo
Here is the configuration I am using for indexing:
{
"type" : "jdbc",
"jdbc" : {
"driver" : "com.mysql.jdbc.Driver",
"url" : "jdbc:mysql://localhost:3306/dbname",
"user" : "user",
"password" : "pass",
"sql" : "SQL QUERY",
"poll" : "24h",
"strategy" : "simple",
"scale" : 0,
"autocommit" : true,
"bulk_size" : 5000,
"max_bulk_requests" : 30,
"bulk_flush_interval" : "5s",
"fetchsize" : 100,
"max_rows" : 149669,
"max_retries" : 3,
"max_retries_wait" : "10s",
"locale" : "in",
"digesting" : true,
"mappings": {
"sorting": {
"properties": {
"pd_lawFirmName": {
"type": "string",
"fields": {
"raw": {
"type": "string",
"index": "not_analyzed"
}
}
}
}
}
}
}
}

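This happens because Elasticsearch tokenizes the text with the default analyzer, which is standard, and the sort then runs on the individual tokens rather than on the whole name. An ascending sort compares the smallest token of each value, so BM&A sorts by "a", Gomez-Acebo & Pombo by "acebo", and Addleshaw Goddard by "addleshaw". For example, McDermott Will Amery is indexed as: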
"amery",
"mcdermott",
"will"
If you want to sort by the full name, I suggest changing the mapping of pd_lawFirmName to something like this (existing documents have to be reindexed before the raw subfield is populated):
"pd_lawFirmName": {
"type": "string",
"fields": {
"raw": {
"type": "string",
"index": "not_analyzed"
}
}
}
and sort by the raw subfield:
http://localhost:9200/jdbc/_search?pretty=true&sort=pd_lawFirmName.raw:asc
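To make the difference concrete, here is a minimal Python sketch that imitates the two behaviours outside Elasticsearch. It is only an approximation: the regular expression stands in for the standard analyzer, and the helper name analyzed_sort_key is made up for this illustration.

```python
import re

def analyzed_sort_key(name):
    """Rough stand-in for sorting on an analyzed field.

    The value is split on non-alphanumeric characters and lowercased
    (approximating the standard analyzer), and an ascending sort then
    compares the smallest resulting token.
    """
    tokens = re.findall(r"[a-z0-9]+", name.lower())
    return min(tokens)

firms = ["Addleshaw Goddard", "BM&A", "Gomez-Acebo & Pombo"]

# Sorting on the analyzed field: keys are "addleshaw", "a", "acebo".
by_analyzed = sorted(firms, key=analyzed_sort_key)

# Sorting on a not_analyzed subfield: the whole string is the key.
by_raw = sorted(firms)

print(by_analyzed)
print(by_raw)
```

The first list reproduces the order from the question and the second one the desired order, which is why the sort has to target pd_lawFirmName.raw.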

Related

Searching for exact phrase with synonyms

I am trying to build a query, where I am using exact phrase match and synonyms and I can't figure it out. Also, when using wildcard approach I don't know how to use fuzziness. Is it even possible with wildcards? It would be great to get same results for terms "call of duty", "cod" or "call of dutz".
I have created this index:
PUT exact_search
{
"settings": {
"index": {
"number_of_shards": "1",
"number_of_replicas": "0",
"analysis": {
"analyzer": {
"analyzer_exact": {
"type": "custom",
"tokenizer": "keyword",
"filter": [
"lowercase",
"icu_folding",
"synonyms"
]
}
},
"filter": {
"synonyms": {
"type": "synonym",
"synonyms_path": "synonyms.txt"
}
}
}
}
},
"mappings": {
"properties": {
"name": {
"type": "keyword",
"fields": {
"analyzer_exact": {
"type": "text",
"analyzer": "analyzer_exact"
}
}
}
}
}
}
And I fill it with these items:
POST exact_search/_doc/1
{
"name": "Hoodie Call of Duty"
}
POST exact_search/_doc/2
{
"name": "Call of Duty 2"
}
POST exact_search/_doc/3
{
"name": "Call of Duty: Modern Warfare 2"
}
POST exact_search/_doc/4
{
"name": "COD: Modern Warfare 2"
}
POST exact_search/_doc/5
{
"name": "Call of duty"
}
POST exact_search/_doc/6
{
"name": "Call of the sea"
}
POST exact_search/_doc/7
{
"name": "Heavy Duty"
}
synonyms.txt looks like this:
cod,call of duty
What I am trying to achieve is to get all the results (except Call of the sea and Heavy Duty) when I search for "call of duty" or "cod".
So far, I constructed this query, but it does not work as expected when using "cod" search term (term "call of duty" works fine):
GET exact_search/_search
{
"explain": false,
"query":{
"bool":{
"must":[
{
"wildcard": {
"name.analyzer_exact": {
"value": "*cod*"
}
}
}
]
}
}
}
But the result is only two items:
{
"took" : 2,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 2,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "exact_search",
"_id" : "4",
"_score" : 1.0,
"_source" : {
"name" : "COD: Modern Warfare 2"
}
},
{
"_index" : "exact_search",
"_id" : "5",
"_score" : 1.0,
"_source" : {
"name" : "Call of duty"
}
}
]
}
}
It looks like the synonyms are working, because the "Call of duty" game is returned, but the wildcards are ignored: Call of Duty 2, for example, is not returned.
I need an exact phrase match, because I don't want to get Heavy Duty or Call of the sea as results (when only the word "call" or "duty" matches).
Thank you for pointing me in the right direction.
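I doubt the synonym filter can produce the expected tokens while analyzer_exact uses "tokenizer": "keyword". The keyword tokenizer emits the whole field value as a single token, so the synonym rule only applies when the entire name is exactly "cod" or "call of duty".
I would change a few things to make it work.
First, switch the tokenizer from keyword to standard: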
"analyzer_exact": {
"type": "custom",
"tokenizer": "standard",
"filter": [
"lowercase",
"synonyms"
]
}
Then I would use match_phrase to exclude names that contain neither call of duty nor cod.
{
"match_phrase": {
"name.analyzer_exact": "cod"
}
}
Response after changes
{
"hits": {
"hits": [
{
"_source": {
"name": "Call of duty"
}
},
{
"_source": {
"name": "COD: Modern Warfare 2"
}
},
{
"_source": {
"name": "Call of Duty 2"
}
},
{
"_source": {
"name": "Hoodie Call of Duty"
}
},
{
"_source": {
"name": "Call of Duty: Modern Warfare 2"
}
}
]
}
}
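The difference between the two analyzers can be imitated in a few lines of Python. This is a simplified sketch, not Lucene: the regular expression stands in for the standard tokenizer, the synonym handling is reduced to the single rule from synonyms.txt, and all function names are hypothetical.

```python
import re

SYNONYM_GROUP = ["cod", "call of duty"]  # the one rule in synonyms.txt

NAMES = {
    1: "Hoodie Call of Duty",
    2: "Call of Duty 2",
    3: "Call of Duty: Modern Warfare 2",
    4: "COD: Modern Warfare 2",
    5: "Call of duty",
    6: "Call of the sea",
    7: "Heavy Duty",
}

def keyword_tokens(value):
    """keyword tokenizer: the whole lowercased value is one token, so the
    synonym rule only fires when the value equals a synonym exactly."""
    token = value.lower()
    return set(SYNONYM_GROUP) if token in SYNONYM_GROUP else {token}

def wildcard_hits(names, needle):
    """Imitates a *needle* wildcard query against the keyword-style tokens."""
    return sorted(i for i, n in names.items()
                  if any(needle in t for t in keyword_tokens(n)))

def contains_phrase(tokens, phrase):
    n = len(phrase)
    return any(tokens[i:i + n] == phrase for i in range(len(tokens) - n + 1))

def phrase_hits(names):
    """Imitates match_phrase with word-level tokens: a document matches if
    any member of the synonym group occurs as a contiguous phrase."""
    phrases = [s.split() for s in SYNONYM_GROUP]
    hits = []
    for doc_id, name in names.items():
        tokens = re.findall(r"[a-z0-9]+", name.lower())
        if any(contains_phrase(tokens, p) for p in phrases):
            hits.append(doc_id)
    return sorted(hits)

print(wildcard_hits(NAMES, "cod"))
print(phrase_hits(NAMES))
```

With the keyword-style tokens only documents 4 and 5 match, as in the question. With word-level tokens and a phrase match, documents 1 to 5 match, while Call of the sea and Heavy Duty stay out.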

Multiple search field elasticsearch php

Hello, I want to build a search over multiple fields with Elasticsearch.
I already have some knowledge of Elasticsearch, but I can't work out how to combine several search criteria in one query.
You can use a combination of bool/must/should clauses to combine multiple conditions:
{
"query": {
"bool": {
"should": [
{
"multi_match": {
"query": "tag"
}
},
{
"match": {
"answers": 0
}
},
{
"match": {
"user": 1234
}
},
{
"multi_match": {
"query": "words here",
"type": "phrase"
}
},
{
"match": {
"score": 3
}
},
{
"match": {
"isaccepted": "yes"
}
}
]
}
}
}
If you want to search on multiple fields, you can use the multi_match query. From the documentation:
If no fields are provided, the multi_match query defaults to the index.query.default_field index settings, which in turn defaults to *. This extracts all fields in the mapping that are eligible to term queries and filters the metadata fields. All extracted fields are then combined to build a query.
Adding a working example with index data, search query, and search result
Index Data:
{
"answers": 0,
"isaccepted": "no"
}
{
"answers": 0,
"isaccepted": "yes"
}
Search Query:
{
"query": {
"multi_match" : {
"query" : "yes"
}
}
}
Search Result:
"hits": [
{
"_index": "67542669",
"_type": "_doc",
"_id": "1",
"_score": 0.2876821,
"_source": {
"answers": 0,
"isaccepted": "yes"
}
}
]
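When the criteria come from an optional search form, the bool/should body from this answer can be assembled programmatically. The following Python sketch only builds the request body as a dictionary; the function name and its parameters are made up for illustration, while the field names (answers, user, score, isaccepted) are the ones used in the query above.

```python
import json

def build_search_body(tag=None, answers=None, user=None,
                      phrase=None, score=None, is_accepted=None):
    """Builds a bool/should query, skipping criteria that were not supplied."""
    clauses = []
    if tag is not None:
        clauses.append({"multi_match": {"query": tag}})
    if answers is not None:
        clauses.append({"match": {"answers": answers}})
    if user is not None:
        clauses.append({"match": {"user": user}})
    if phrase is not None:
        # "type": "phrase" makes multi_match look for the words in order.
        clauses.append({"multi_match": {"query": phrase, "type": "phrase"}})
    if score is not None:
        clauses.append({"match": {"score": score}})
    if is_accepted is not None:
        clauses.append({"match": {"isaccepted": is_accepted}})
    return {"query": {"bool": {"should": clauses}}}

body = build_search_body(user=1234, is_accepted="yes")
print(json.dumps(body, indent=2))
```

Swapping "should" for "must" in the returned dictionary would require every supplied criterion to match instead of any of them.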

Elastic search geo point query filter

I am trying to use geo_point for distance queries, but the location field is always mapped as double instead of geo_point. How can I map location as geo_point?
I need to find all records within 5 km, sorted.
"pin" : {
"properties" : {
"location" : {
"properties" : {
"lat" : {
"type" : "double"
},
"lon" : {
"type" : "double"
}
}
},
"type" : {
"type" : "string"
}
}
},
When I search with the query below to find results within 5 km of the Delhi coordinates:
{
"query": {
"filtered": {
"query": {
"match_all": {}
},
"filter": {
"pin": {
"distance": "5",
"distance_unit": "km",
"coordinate": {
"lat": 28.5402707,
"lon": 77.2289643
}
}
}
}
}
}
It fails with a query_parsing_exception: No query registered for [pin]. I cannot figure out the problem; it always throws this exception:
{
"error": {
"root_cause": [
{
"type": "query_parsing_exception",
"reason": "No query registered for [pin]",
"index": "find_index",
"line": 1,
"col": 58
}
],
"type": "search_phase_execution_exception",
"reason": "all shards failed",
"phase": "query",
"grouped": true,
"failed_shards": [
{
"shard": 0,
"index": "find_index",
"node": "DtpkEdLCSZCr8r2bTd8p5w",
"reason": {
"type": "query_parsing_exception",
"reason": "No query registered for [pin]",
"index": "find_index",
"line": 1,
"col": 58
}
}
]
},
"status": 400
}
Please help me figure out this problem. How can I set up geo_point and get rid of this exception (status 400, all shards failed)?
You simply need to map the location field of your type with the geo_point data type, like this:
# 1. first delete your index
DELETE shouut_find_index
# 2. create a new one with the proper mapping
PUT shouut_find_index
{
"mappings": {
"search_shouut": {
"properties": {
"location": {
"type": "geo_point"
},
"type": {
"type": "string"
}
}
}
}
}
Then you can index a document like this:
PUT shouut_find_index/search_shouut/1
{
"location": {
"lat": 28.5402707,
"lon": 77.2289643
},
"type": "dummy"
}
Finally, your query can use the geo_distance filter (pin is not a registered query or filter name, which is why the parser rejected it):
POST shouut_find_index/search_shouut/_search
{
"query": {
"filtered": {
"filter": {
"geo_distance": {
"distance": "5km",
"location": {
"lat": 28.5402707,
"lon": 77.2289643
}
}
}
}
}
}
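To see what a 5 km distance filter with a nearest-first sort amounts to, here is a small Python sketch that performs the same kind of check locally with the haversine formula. It is an illustration only: the sample points and function names are invented, and Elasticsearch's own distance computation may differ slightly.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an approximation

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two lat/lon points in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def within_radius(points, center, radius_km):
    """Returns the names of points inside the radius, nearest first."""
    found = []
    for name, (lat, lon) in points.items():
        distance = haversine_km(center[0], center[1], lat, lon)
        if distance <= radius_km:
            found.append((distance, name))
    return [name for _, name in sorted(found)]

center = (28.5402707, 77.2289643)  # coordinates from the query above
points = {                          # hypothetical sample documents
    "same_spot": center,
    "about_1km_north": (28.5502707, 77.2289643),
    "about_11km_north": (28.6402707, 77.2289643),
}

nearby = within_radius(points, center, 5)
print(nearby)
```

The point roughly 11 km away is dropped and the remaining two come back ordered by distance, which is the behaviour the question asks for.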

ElasticSearch Aggregations with script doc_values

I have a field "location_facet", which is a string with this mapping:
"location_facet": {
"type": "string",
"index": "not_analyzed",
"include_in_all": true
},
This field holds location IDs, and I would like to aggregate on them in the query. I also want to get some extra information from the aggregation, which is why I would like to execute a script. For example, I want to get the name of the city, but I always get an error:
"aggs": {
"location_radius": {
"terms": {
"field": "location_facet",
"size": 10,
"script": "doc[\"location_name\"].value"
}
}
},
"type": "script_exception",
"reason": "failed to run inline script [doc[\"location_name\"].value] using lang [groovy]"
I currently have this implementation with facets:
$tagFacet = new \Elastica\Facet\Terms("location_radius");
$tagFacet->setField('location_name');
$tagFacet->setAllTerms(true);
$tagFacet->setSize(100);
$tagFacet->setScript(
"doc['location'].empty ? null : ceil(doc['location'].arcDistanceIn".$unit."(".
$entries->getLocationLatitude().", ".
$entries->getLocationLongitude().")) + '|' + doc['location_name'].value
+ '|' + doc['location_id'].value"
);
What am I doing wrong?
If I test this one:
"aggs": {
"location_radius": {
"terms": {
"field": "location_facet",
"size": 10,
"script": "doc['location'].empty ? null : ceil(doc['location'].arcDistanceInKm(51.2249429, 6.7756524))"
}
}
},
I get the error:
"type": "script_exception",
"reason": "failed to run inline script [doc['location'].empty ? null : ceil(doc['location'].arcDistanceInKm(51.2249429, 6.7756524))] using lang [groovy]",
"caused_by": {
"type": "illegal_argument_exception",
"reason": "Negative position"
}
Solution: every field accessed through doc[...] in the script also needs to be mapped with "index": "not_analyzed".
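Following that conclusion, a quick way to spot problem fields is to scan the mapping for string fields that are still analyzed. The Python sketch below does this on a plain dictionary; the helper name is made up, and the location_name entry is an assumed example because its real mapping is not shown in the question.

```python
def analyzed_string_fields(properties):
    """Returns the names of string fields that are not "not_analyzed"."""
    return sorted(
        name for name, spec in properties.items()
        if spec.get("type") == "string"
        and spec.get("index") != "not_analyzed"
    )

properties = {
    # Mapping taken from the question.
    "location_facet": {
        "type": "string",
        "index": "not_analyzed",
        "include_in_all": True,
    },
    # Assumed mapping, for illustration only.
    "location_name": {"type": "string"},
}

needs_fix = analyzed_string_fields(properties)
print(needs_fix)
```

Any field listed by the helper and referenced in the script would be a candidate for remapping.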

ElasticSearch PHP - get list of 10 (most recent) items in type/index

I have an index called publications_items and a type called "publication".
I'd like to get the 10 most recently added publications (some kind of match_all query).
I'm using ElasticSearch PHP (http://www.elasticsearch.org/guide/en/elasticsearch/client/php-api/current/_quickstart.html)
Basically I'm just doing a default search, but I don't know how to do this in ElasticSearch PHP.
In sense.qbox.io I do:
POST /publications_items/_search
{
"query": {
"match_all": {}
}
}
and it works fine.
Mapping:
PUT /publications_items/
{
"mappings": {
"publication": {
"properties": {
"title": {
"type": "string"
},
"url": {
"type": "string"
},
"description": {
"type": "string"
},
"year": {
"type": "integer"
},
"author": {
"type": "string"
}
}
}
}
}
You need to enable "_timestamp" mapping:
PUT /test/doc/_mapping
{
"_timestamp": {
"enabled": "true",
"store": "true"
}
}
And in your search query you need to sort by it and retrieve the first 10 documents:
GET /test/_search
{
"sort" : {
"_timestamp" : { "order" : "desc" }
},
"from" : 0, "size" : 10
}
And specifically in Elasticsearch PHP:
mappings change:
require 'vendor/autoload.php';
$client = new Elasticsearch\Client();
$params = array();
$params2 = [
'_timestamp' => [
'enabled' => 'true',
'store' => 'true'
]
];
$params['index']='test';
$params['type']='doc';
$params['body']['doc']=$params2;
$client->indices()->putMapping($params);
query:
require 'vendor/autoload.php';
$client = new Elasticsearch\Client();
$json = '{
"sort" : {
"_timestamp" : { "order" : "desc" }
},
"from" : 0, "size" : 10
}';
$params['index'] = 'test';
$params['type'] = 'doc';
$params['body'] = $json;
$results = $client->search($params);
echo json_encode($results, JSON_PRETTY_PRINT);
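For readers who assemble the request body in code rather than as a JSON string, the same "latest N documents" body can be built as a dictionary. This is a language-neutral sketch written in Python; the function name and its parameters are hypothetical, and it assumes the _timestamp field has been enabled as described above.

```python
import json

def latest_items_body(size=10, timestamp_field="_timestamp"):
    """Builds a search body that returns the newest documents first."""
    return {
        "query": {"match_all": {}},
        "sort": [{timestamp_field: {"order": "desc"}}],
        "from": 0,
        "size": size,
    }

body = latest_items_body()
print(json.dumps(body, indent=2))
```

Passing a different timestamp_field, for example a date field of your own, keeps the rest of the body unchanged.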
