I have a CodeIgniter application in which I am trying to set up Elasticsearch. My JSON looks like this:
{
"took": 4,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 1,
"relation": "eq"
},
"max_score": 1.0,
"hits": [
{
"_index": "features",
"_type": "liste",
"_id": "test",
"_score": 1.0,
"_source": {
"mnuActiveSection": "securite.features",
"rowsFeatures": [
{
"id": "50",
"idCategory": "5",
"Name": "01- Data",
"FeatureCategory": "Serie",
"NombrePrivileges": "9",
"ListePrivileges": "Admin"
},
{
"id": "51",
"idCategory": "5",
"Name": "02- Data 2",
"FeatureCategory": "Documentary",
"NombrePrivileges": "3",
"ListePrivileges": "Direction"
},
{
"id": "52",
"idCategory": "5",
"Name": "03- Data 3",
"FeatureCategory": "Films",
"NombrePrivileges": "7",
"ListePrivileges": "Super Admin"
}
]
}
}
]
}
}
I would like to display the list of features in my view and search the documents via Elasticsearch. Unfortunately, I cannot display the JSON data.
The result above is not what I want, because all my data seems to be grouped into a single document containing one array.
I tried to use a "foreach" to fix this, but I still do not get what I'm looking for. I am stuck at this stage.
I would like each row of my table to have a different _id, for example, so that I can visualize the rows better.
public function liste()
{
    $data["mnuActiveSection"] = "securite.features";
    $Finder = new FeaturesFinder();
    $data["rowsFeatures"] = $Finder->FindAll();
    $this->load->library('elasticsearch');
    foreach($data["rowsFeatures"] as $d){
        $this->elasticsearch->index = 'features';
        $this->elasticsearch->create();
        $this->elasticsearch->add($type ='liste',$id = 'test',$data);
        var_dump($d);
    }
}
If anyone could point me in the right direction, I would appreciate it.
Thank you very much.
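I think what you're trying to do is this. Every call used the same _id ('test') and passed the whole $data array, so each iteration overwrote the same single document. Give each row its own _id and pass the row ($d) instead:
// Assumption: create() creates the index, so it only needs to run once.
$this->elasticsearch->index = 'features';
$this->elasticsearch->create();
$i = 0;
foreach ($data["rowsFeatures"] as $d) {
    $i++;
    // Index one feature per document, each with its own _id.
    $this->elasticsearch->add('liste', $i, $d);
    var_dump($d);
}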
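The same idea, sketched in Python rather than PHP so that it runs without CodeIgniter or a cluster: split the rows into one document per feature, each keyed by its own id. The function name `to_documents` is hypothetical, and reusing the row's own `id` as the document `_id` is an assumption; the counter above works just as well.

```python
def to_documents(rows, id_field="id"):
    """Return (doc_id, body) pairs, one per row, instead of one big document."""
    return [(str(row[id_field]), row) for row in rows]


# Sample rows taken from the JSON in the question (fields trimmed).
rows_features = [
    {"id": "50", "Name": "01- Data", "FeatureCategory": "Serie"},
    {"id": "51", "Name": "02- Data 2", "FeatureCategory": "Documentary"},
    {"id": "52", "Name": "03- Data 3", "FeatureCategory": "Films"},
]

documents = to_documents(rows_features)
for doc_id, body in documents:
    # Each pair would become one index request with its own _id.
    print(doc_id, body["Name"])
```

Each pair can then be sent as a separate index request, so a search returns three hits instead of one.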
I am writing test methods for my app, which uses Elasticsearch. When I run a test method that should return values from Elasticsearch, the response is always empty. How can I solve this problem? Here is the code I am running.
public function testGetPosts()
{
    $brand = factory(Brand::class)->create();
    $account = factory(Account::class)->create();
    $post = factory(Post::class)->create();
    $response = $this->actingAs($this->owner)->json(
        'GET',
        '/api/publish/posts',
        [
            'account_id' => [(string) $account->id],
            'skip' => 0,
        ]
    );
    $response->assertStatus(200);
}
I know this post is old, but I am adding the answer I found for this problem.
To make sure your data is searchable before you query it, call refresh on the index you just wrote to.
A refresh makes recently indexed documents visible to search, so you can be sure the data is there when you query it.
It is also faster than the sleep(1) suggested by the author =)
The official Elasticsearch documentation covers the refresh API.
Hope this helps someone.
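To illustrate why the test sees an empty response, here is a toy in-memory model in Python. It is not the real client: `ToyIndex` and its methods are hypothetical names, and the model only mimics the behaviour that writes become searchable after a refresh.

```python
class ToyIndex:
    """Toy model of near-real-time search (not the real Elasticsearch client).

    Writes land in a buffer and only become searchable after refresh().
    """

    def __init__(self):
        self._buffer = []
        self._searchable = []

    def index(self, doc):
        # A write is accepted but not yet visible to search.
        self._buffer.append(doc)

    def refresh(self):
        # Make everything written so far visible to search.
        self._searchable.extend(self._buffer)
        self._buffer.clear()

    def search(self, field, value):
        return [d for d in self._searchable if d.get(field) == value]


idx = ToyIndex()
idx.index({"account_id": "1", "title": "post"})
before = idx.search("account_id", "1")  # queried too early
idx.refresh()
after = idx.search("account_id", "1")   # queried after the refresh
print(len(before), len(after))
```

Against a real cluster the equivalent step is the refresh API, `POST /<index>/_refresh`, called between the write and the search.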
Almost a year later, I'm sure you've moved on by now.
You stated:
Elasticsearch doesn't index the created post. It should be indexed
Why would it be indexed? Unless, of course, you have code that indexes it in your setUp(), or you're testing against an external ES server and assuming it's always available and contains the exact data you're testing against.
Another solution is to mock the request, since Elasticsearch returns JSON. All we need to do is mock an HTTP request that has a status of 200 and returns JSON. We can place this JSON file in our tests/ directory, and it will contain the sample results that Elasticsearch would return.
An example test would look like this:
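// Assumption: an elasticsearch-php version that ships the RingPHP handlers.
use Elasticsearch\ClientBuilder;
use GuzzleHttp\Ring\Client\MockHandler;

// Every request gets HTTP 200 and the JSON fixture as its body.
$handler = new MockHandler([
    'status' => 200,
    'transfer_stats' => [
        'total_time' => 100
    ],
    'body' => fopen(base_path('tests/Unit/mockelasticsearch.json'), 'r')
]);
$builder = ClientBuilder::create();
$builder->setHosts(['testing']);
$builder->setHandler($handler);
$client = $builder->build();

// The request never reaches a real cluster; the handler answers it.
$response = $client->search([
    'index' => 'my_index',
    'type' => 'my_type',
    'body' => [
        'query' => [
            'simple_query_string' => [
                'query' => 'john',
                'fields' => ['name']
            ]
        ]
    ]
]);
// Test against the "$response", i.e., $this->assertEquals(2 ...) etc.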
The JSON file, which you would need to customize for your index, then looks like this:
{
"took": 2,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 121668,
"max_score": 1,
"hits": [
{
"_index": "test",
"_type": "test-type",
"_id": "1111",
"_score": 1,
"_source": {
"id": "1111",
"title": "Some Foo",
"timestamp": "2017-08-02T15:45:22-05:00"
}
},
{
"_index": "test",
"_type": "test-type",
"_id": "2222",
"_score": 1,
"_source": {
"id": "2222",
"title": "Dolor Sit Amet",
"timestamp": "2017-08-02T15:45:22-05:00"
}
},
{
"_index": "test",
"_type": "test-type",
"_id": "3333",
"_score": 1,
"_source": {
"id": "3333",
"title": "Consectetur Adipiscing Elit",
"timestamp": "2017-08-02T15:45:22-05:00"
}
},
{
"_index": "test",
"_type": "test-type",
"_id": "4444",
"_score": 1,
"_source": {
"id": "4444",
"title": "Sed Do Eiusmod",
"timestamp": "2017-08-02T15:45:22-05:00"
}
},
{
"_index": "test",
"_type": "test-type",
"_id": "5555",
"_score": 1,
"_source": {
"id": "5555",
"title": "Tempor Incididunt",
"timestamp": "2017-08-02T15:45:22-05:00"
}
},
{
"_index": "test",
"_type": "test-type",
"_id": "6666",
"_score": 1,
"_source": {
"id": "6666",
"title": "Ut Labore Et Dolore",
"timestamp": "2017-08-02T15:45:22-05:00"
}
},
{
"_index": "test",
"_type": "test-type",
"_id": "7777",
"_score": 1,
"_source": {
"id": "7777",
"title": "Magna Aliqua",
"timestamp": "2017-08-02T15:45:22-05:00"
}
},
{
"_index": "test",
"_type": "test-type",
"_id": "8888",
"_score": 1,
"_source": {
"id": "8888",
"title": "Ut Enim Ad Minim",
"timestamp": "2017-08-02T15:45:22-05:00"
}
},
{
"_index": "test",
"_type": "test-type",
"_id": "9999",
"_score": 1,
"_source": {
"id": "9999",
"title": "Veniam, Quis Nostrud",
"timestamp": "2017-08-02T15:45:22-05:00"
}
},
{
"_index": "test",
"_type": "test-type",
"_id": "0000",
"_score": 1,
"_source": {
"id": "0000",
"title": "Exercitation Ullamco Laboris",
"timestamp": "2017-08-02T15:45:22-05:00"
}
}
]
}
}
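The assertion side of such a test can be sketched in Python without any client at all: load the canned response and check what your code extracts from it. The fixture below is a trimmed copy of the JSON above, and `titles_from_response` is a hypothetical helper standing in for whatever your application does with the hits.

```python
import json

# Trimmed copy of the canned Elasticsearch response used as a fixture.
CANNED_RESPONSE = """
{
  "took": 2,
  "timed_out": false,
  "hits": {
    "total": 2,
    "max_score": 1,
    "hits": [
      {"_index": "test", "_id": "1111", "_score": 1,
       "_source": {"id": "1111", "title": "Some Foo"}},
      {"_index": "test", "_id": "2222", "_score": 1,
       "_source": {"id": "2222", "title": "Dolor Sit Amet"}}
    ]
  }
}
"""


def titles_from_response(response):
    """Pull the title out of every hit in a search response."""
    return [hit["_source"]["title"] for hit in response["hits"]["hits"]]


response = json.loads(CANNED_RESPONSE)
titles = titles_from_response(response)
print(titles)
```

Because the fixture is fixed, the test stays deterministic and does not depend on a running cluster or on refresh timing.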
I am trying to find subdomains of a main domain in Elasticsearch.
I added a few domains to Elasticsearch:
$domains = [
    'site.com',
    'ns1.site.com',
    'ns2.site.com',
    'test.main.site.com',
    'sitesite.com',
    'test-site.com',
];
foreach ($domains as $domain) {
    $params = [
        'index' => 'my_index',
        'type' => 'my_type',
        'body' => ['domain' => $domain],
    ];
    $client->index($params);
}
Then I try to search:
$params = [
    'index' => 'my_index',
    'type' => 'my_type',
    'body' => [
        'query' => [
            'wildcard' => [
                'domain' => [
                    'value' => '.site.com',
                ],
            ],
        ],
    ],
];
$response = $client->search($params);
But it found nothing. :(
My mapping is:
https://pastebin.com/raw/k9MzjJUM
Any ideas on how to fix it?
Thanks
You're almost there, just a couple of things missing.
How to make an "ends with" query?
It's enough to add * in your query (that's why this query is called wildcard):
POST my_index/my_type/_search
{
"query": {
"wildcard" : { "domain" : "*.site.com" }
}
}
This will give you the following result:
{
...
"hits": {
"total": 1,
"max_score": 1,
"hits": [
{
"_index": "my_index",
"_type": "my_type",
"_id": "RoE8VGMBRuo1XmkIXhp0",
"_score": 1,
"_source": {
"domain": "test.main.site.com"
}
}
]
}
}
Seems to work, but we only get one of the results (not all of them).
Why doesn't it return all matching documents?
Returning to your mapping, the field domain has type text:
PUT my_index
{
"mappings": {
"my_type": {
"properties": {
"domain": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
}
}
}
}
This means that the content of that field will be tokenized and lowercased (with the standard analyzer). You can see which tokens will actually be searchable using the _analyze API, like this:
POST _analyze
{
"text": "test.main.site.com"
}
{
"tokens": [
{
"token": "test.main.site.com",
"start_offset": 0,
"end_offset": 18,
"type": "<ALPHANUM>",
"position": 0
}
]
}
That's why wildcard query could match test.main.site.com.
What if we take n1.site.com?
POST _analyze
{
"text": "n1.site.com"
}
{
"tokens": [
{
"token": "n1",
"start_offset": 0,
"end_offset": 2,
"type": "<ALPHANUM>",
"position": 0
},
{
"token": "site.com",
"start_offset": 3,
"end_offset": 11,
"type": "<ALPHANUM>",
"position": 1
}
]
}
As you can see, there is no token that ends with .site.com (note the . before the site.com).
Fortunately, your mapping can already return all the results.
How to return all the results for an "ends with" query?
You could use the keyword sub-field, which uses the exact value for querying:
POST my_index/my_type/_search
{
"query": {
"wildcard" : { "domain.keyword" : "*.site.com" }
}
}
This will give you the following result:
{
"hits": {
"total": 3,
"max_score": 1,
"hits": [
{
"_index": "my_index",
"_type": "my_type",
"_id": "RoE8VGMBRuo1XmkIXhp0",
"_score": 1,
"_source": {
"domain": "test.main.site.com"
}
},
{
"_index": "my_index",
"_type": "my_type",
"_id": "Q4E8VGMBRuo1XmkIFRpy",
"_score": 1,
"_source": {
"domain": "ns1.site.com"
}
},
{
"_index": "my_index",
"_type": "my_type",
"_id": "RYE8VGMBRuo1XmkIORqG",
"_score": 1,
"_source": {
"domain": "ns2.site.com"
}
}
]
}
}
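To see why the text field misses documents that the keyword sub-field finds, here is a toy Python model. The token lists are copied from the _analyze output above; `fnmatchcase` is only an approximation of the wildcard query (it also accepts `[...]` patterns), and the function names are hypothetical.

```python
from fnmatch import fnmatchcase

# Tokens as reported by the _analyze calls above.
ANALYZED_TOKENS = {
    "test.main.site.com": ["test.main.site.com"],
    "n1.site.com": ["n1", "site.com"],
}


def matches_text_field(domain, pattern):
    """A wildcard on a text field is tested against each token separately."""
    return any(fnmatchcase(token, pattern) for token in ANALYZED_TOKENS[domain])


def matches_keyword_field(domain, pattern):
    """A wildcard on a keyword field is tested against the whole value."""
    return fnmatchcase(domain, pattern)


pattern = "*.site.com"
text_hits = [d for d in ANALYZED_TOKENS if matches_text_field(d, pattern)]
keyword_hits = [d for d in ANALYZED_TOKENS if matches_keyword_field(d, pattern)]
print(text_hits)
print(keyword_hits)
```

Neither "n1" nor "site.com" ends with ".site.com", so the tokenized field fails to match, while the untouched keyword value does.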
Is this the best way to do "ends with"-like queries?
Actually, no. wildcard queries can be very slow:
Note that this query can be slow, as it needs to iterate over many
terms. In order to prevent extremely slow wildcard queries, a wildcard
term should not start with one of the wildcards * or ?.
To achieve best performance, in your case, I would suggest creating another field, higherLevelDomains, and manually extracting the higher level domains from the original. The document might look like:
POST my_index/my_type
{
"domain": "test.main.site.com",
"higherLevelDomains": [
"main.site.com",
"site.com",
"com"
]
}
This will allow you to use term query:
POST my_index/my_type/_search
{
"query": {
"term" : { "higherLevelDomains.keyword" : "site.com" }
}
}
This is probably the most efficient query you can get with Elasticsearch for such a task.
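Extracting the higher level domains at index time could look like the following Python sketch. The helper names `higher_level_domains` and `build_document` are hypothetical, and lowercasing the input is an assumption about how you want to normalize domains.

```python
def higher_level_domains(domain):
    """Return every parent domain of `domain`, from nearest to top level."""
    labels = domain.lower().split(".")
    # Drop one leading label at a time: a.b.c -> b.c -> c
    return [".".join(labels[i:]) for i in range(1, len(labels))]


def build_document(domain):
    """Build the document body with the extra higherLevelDomains field."""
    return {"domain": domain, "higherLevelDomains": higher_level_domains(domain)}


doc = build_document("test.main.site.com")
print(doc)
```

With this field populated, finding all subdomains of site.com is an exact term lookup and needs no leading wildcard.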
Hope that helps!
Suppose I have stored the data below and want to search for the term xy in the old_value and new_value fields of the documents whose field_name is curriculum_name_en or curriculum_name_pr:
{
"took": 1,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 98,
"max_score": 1,
"hits": [
{
"_index": "my_index",
"_type": "audit_field",
"_id": "57526c197e83c",
"_score": 1,
"_source": {
"session_id": 119,
"trans_seq_no": 1,
"table_seq_no": 1,
"field_id": 2,
"field_name": "curriculum_id",
"new_value": 118,
"old_value": null
}
},
{
"_index": "my_index",
"_type": "audit_field",
"_id": "57526c197f2c3",
"_score": 1,
"_source": {
"session_id": 119,
"trans_seq_no": 1,
"table_seq_no": 1,
"field_id": 3,
"field_name": "curriculum_name_en",
"new_value": "Test Index creation",
"old_value": null
}
},
{
"_index": "my_index",
"_type": "audit_field",
"_id": "57526c198045c",
"_score": 1,
"_source": {
"session_id": 119,
"trans_seq_no": 1,
"table_seq_no": 1,
"field_id": 4,
"field_name": "curriculum_name_pr",
"new_value": null,
"old_value": null
}
},
{
"_index": "my_index",
"_type": "audit_field",
"_id": "57526c1981512",
"_score": 1,
"_source": {
"session_id": 119,
"trans_seq_no": 1,
"table_seq_no": 1,
"field_id": 5,
"field_name": "curriculum_name_pa",
"new_value": null,
"old_value": null
}
}
]
}
}
Many more fields may be present. The user may select one or more of these fields and define a search term to apply across the selected fields. The challenge is how to tell Elasticsearch to match field_name against the fields the user selected, and then search in old_value and new_value.
For example, the user selects curriculum_name_en and curriculum_name_pr and wants to search for xy inside the old_value and new_value fields of the documents whose field_name is one of those fields.
How can we do that?
The idea behind this requirement is that the query should match new_value and/or old_value only if field_name matches a certain value as well. There is no programmatic-like way of saying "if this then that".
What I'm suggesting is something like this:
{
"query": {
"bool": {
"must": [
{
"terms": {
"field_name": [
"curriculum_name_en",
"curriculum_name_pr"
]
}
},
{
"multi_match": {
"query": "Test Index",
"fields": ["new_value","old_value"]
}
}
]
}
}
}
So your "if this then that" condition becomes a must clause in a bool query, where both the "if" and the "then" branches live inside the must.
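Since the field names and the search term come from the user, the query body is usually built in code. A minimal Python sketch, assuming the same field names as above (`build_audit_query` is a hypothetical helper name):

```python
def build_audit_query(selected_fields, search_term):
    """Build the bool query: field_name must be one of the selected fields
    AND the term must match new_value or old_value."""
    return {
        "query": {
            "bool": {
                "must": [
                    {"terms": {"field_name": list(selected_fields)}},
                    {
                        "multi_match": {
                            "query": search_term,
                            "fields": ["new_value", "old_value"],
                        }
                    },
                ]
            }
        }
    }


query = build_audit_query(["curriculum_name_en", "curriculum_name_pr"], "xy")
print(query)
```

The resulting dictionary serializes to the same JSON shape as the query above and can be passed as the search body.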
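This may solve your problem. Note that the filtered query and the and filter were deprecated in Elasticsearch 2.0 and removed in 5.0, so this form only works on older versions: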
{
"query": {
"filtered": {
"filter": {
"and": [
{
"query" : {
"terms" : {
"field_name" : [
"curriculum_name_en",
"curriculum_name_pr"
],
"minimum_match" : 1
}
}
},
{
"query" : {
"terms" : {
"new_value" : [
"test", "index"
],
"minimum_match" : 1
}
}
}
]
}
}
}
}
I'm using FOSElasticaBundle with Symfony2 in my project. There are entry and user tables in a MySQL database, and each entry belongs to one user.
I want to get just one entry per user from all the entries in the database.
Entries Representation
[
{
"id": 1,
"name": "Hello world",
"user": {
"id": 17,
"username": "foo"
}
},
{
"id": 2,
"name": "Lorem ipsum",
"user": {
"id": 15,
"username": "bar"
}
},
{
"id": 3,
"name": "Dolar sit amet",
"user": {
"id": 17,
"username": "foo"
}
}
]
Expected result is:
[
{
"id": 1,
"name": "Hello world",
"user": {
"id": 17,
"username": "foo"
}
},
{
"id": 2,
"name": "Lorem ipsum",
"user": {
"id": 15,
"username": "bar"
}
}
]
But it returns all entries in the table. I've tried adding an aggregation to my Elasticsearch query, and nothing changed.
$distinctAgg = new \Elastica\Aggregation\Terms("distinctAgg");
$distinctAgg->setField("user.id");
$distinctAgg->setSize(1);
$query->addAggregation($distinctAgg);
Is there any way to do this via term filter or anything else? Any help would be great. Thank you.
Aggregations are not easy to understand when you are used to GROUP BY in MySQL.
The first thing is that aggregation results are not returned in hits, but in aggregations. So when you get the result of your search, you have to get the aggregations like this:
$results = $search->search();
$aggregationsResults = $results->getAggregations();
The second thing is that aggregations won't return the source. With the aggregation in your example, you will only know that there is 1 entry for user ID 15 and 2 entries for user ID 17.
E.g. with this query:
{
"query": {
"match_all": {}
},
"aggs": {
"byUser": {
"terms": {
"field": "user.id"
}
}
}
}
Result:
{
"took": 1,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 3,
"max_score": 1,
"hits": [ ... ]
},
"aggregations": {
"byUser": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": 17,
"doc_count": 2
},
{
"key": 15,
"doc_count": 1
}
]
}
}
}
If you want to get results, the same way you would do with a GROUP BY in MySQL, you have to use a top_hits sub-aggregation:
{
"query": {
"match_all": {}
},
"aggs": {
"byUser": {
"terms": {
"field": "user.id"
},
"aggs": {
"results": {
"top_hits": {
"size": 1
}
}
}
}
}
}
Result:
{
"took": 3,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 3,
"max_score": 1,
"hits": [ ... ]
},
"aggregations": {
"byUser": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": 17,
"doc_count": 2,
"results": {
"hits": {
"total": 2,
"max_score": 1,
"hits": [
{
"_index": "test_stackoverflow",
"_type": "test1",
"_id": "1",
"_score": 1,
"_source": {
"id": 1,
"name": "Hello world",
"user": {
"id": 17,
"username": "foo"
}
}
}
]
}
}
},
{
"key": 15,
"doc_count": 1,
"results": {
"hits": {
"total": 1,
"max_score": 1,
"hits": [
{
"_index": "test_stackoverflow",
"_type": "test1",
"_id": "2",
"_score": 1,
"_source": {
"id": 2,
"name": "Lorem ipsum",
"user": {
"id": 15,
"username": "bar"
}
}
}
]
}
}
}
]
}
}
}
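Turning that response into the "one entry per user" list from the question is a matter of walking the buckets. A Python sketch, using the aggregation names byUser and results from the query above (`one_entry_per_user` is a hypothetical helper, and the sample response is trimmed):

```python
def one_entry_per_user(response, agg_name="byUser", top_hits_name="results"):
    """Collect the first top_hits document from every terms bucket."""
    entries = []
    for bucket in response["aggregations"][agg_name]["buckets"]:
        hits = bucket[top_hits_name]["hits"]["hits"]
        if hits:
            entries.append(hits[0]["_source"])
    return entries


# Trimmed version of the response shown above.
sample_response = {
    "aggregations": {
        "byUser": {
            "buckets": [
                {"key": 17, "doc_count": 2, "results": {"hits": {"hits": [
                    {"_id": "1", "_source": {"id": 1, "name": "Hello world",
                                             "user": {"id": 17, "username": "foo"}}}
                ]}}},
                {"key": 15, "doc_count": 1, "results": {"hits": {"hits": [
                    {"_id": "2", "_source": {"id": 2, "name": "Lorem ipsum",
                                             "user": {"id": 15, "username": "bar"}}}
                ]}}},
            ]
        }
    }
}

entries = one_entry_per_user(sample_response)
print([entry["name"] for entry in entries])
```

The entries come back in bucket order (largest doc_count first by default), not in the order of the original hits.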
More information on this page: https://www.elastic.co/blog/top-hits-aggregation