Elasticsearch Completion - php

I have an Elasticsearch index which I update every 10 minutes via a cron job. The index has a completion field, which works as expected.
But I have one small problem. Say I have an "article" field whose value I change from "a" to "b". After 10 minutes the index is updated, and the document that held article "a" now holds article "b". Everything as expected.
But my completion field now holds both values, "a" and "b", both with the same id.
How can this happen?

Mapping:
'suggest' => array(
'type' => 'completion',
'payloads' => true,
'preserve_separators' => false,
'search_analyzer' => 'standard',
'index_analyzer' => 'standard'
),
How i set the field:
'suggest' => array(
'input' => array(
$result["Name"],
$result["Name"],
$result["Name2"],
$result["Name3"],
$result["Name4"],
$result["Name5"]
),
'output' => $result["Name"].' (' . $result["Name1"].', '.$result["Name2"].')',
'payload' => array(
'id' => $result["ID"]
)
)

Found the answer in the docs.
The suggest data structure might not reflect deletes on documents immediately. You may need to do an Optimize for that. You can call optimize with the only_expunge_deletes=true to only cater for deletes or alternatively call a Merge operation.
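For anyone who wants to trigger this from PHP, here is a minimal sketch of the call the docs describe, sent as a plain HTTP request. The host, the index name "articles" and both function names are assumptions for illustration, not values from the question. The _optimize endpoint belongs to the Elasticsearch 1.x line that the mapping above targets; later releases renamed it to _forcemerge.

```php
<?php
// Sketch only: ask Elasticsearch 1.x to expunge deleted documents so the
// completion suggester stops returning stale inputs.
// Host, index name and function names are hypothetical.

function buildExpungeDeletesUrl(string $host, string $index): string
{
    // only_expunge_deletes=true limits the optimize to segments with deletes.
    return rtrim($host, '/') . '/' . rawurlencode($index)
        . '/_optimize?only_expunge_deletes=true';
}

function expungeDeletes(string $host, string $index): bool
{
    // Requires the PHP curl extension.
    $ch = curl_init(buildExpungeDeletesUrl($host, $index));
    curl_setopt($ch, CURLOPT_CUSTOMREQUEST, 'POST');
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    $response = curl_exec($ch);
    $status = curl_getinfo($ch, CURLINFO_HTTP_CODE);

    return $response !== false && $status === 200;
}

// Example: build the URL that would be called after the cron update.
$url = buildExpungeDeletesUrl('http://localhost:9200/', 'articles');
```

Calling expungeDeletes() at the end of the cron job would be one place to put it, though on a large index it may be worth running it less often than every 10 minutes.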

Related

mongoDB, PHP update specific value not all the values

I am having a problem updating values I get from a web service.
$collection = $modb->$table;
$collection->update(array("id" => (int)$row['id']),
array('$set' => array(
"user_id" => (int)$post_data_array['user_id'],
"story" => (int)$post_data_array['story'],
"surprize_sub1" => (int)$post_data_array['surprize_sub1'],
"surprize_sub2" => (int)$post_data_array['surprize_sub2'],
"surprize_sub3" => (int)$post_data_array['surprize_sub3'],
"exr_solve" => (int)$post_data_array['exr_solve'],
"exr_assessmnt" => (int)$post_data_array['exr_assessmnt'],
"exr_refresh" => (int)$post_data_array['exr_refresh'],
"sound_control" => (int)$post_data_array['sound_control'],
"clock_control" => (int)$post_data_array['clock_control'],
"switch_user" => (int)$post_data_array['switch_user'],
"exr_print" => (int)$post_data_array['exr_print'],
"write_on_wall" => (int)$post_data_array['write_on_wall'],
"switch_letter" => (int)$post_data_array['switch_letter'],
"view_controls" => (int)$post_data_array['view_controls'],
)));
I get these values from end users. I want only the fields that were sent to be updated, without losing the rest of the data.
With this code only the sent data is set and the rest is removed. I want to change only the sent fields and keep the rest as they are. Please advise.
You need to use updateOne instead of update.
updateOne
Use the MongoDB\Collection::updateOne() method to update a single document matching a filter.
$collection = $modb->$table;
$collection->updateOne(array("id" => (int)$row['id']),
array('$set' => array(
// .... array elements
)));
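One possible reading of the question is that fields missing from the request get overwritten because every key is cast and written unconditionally (a missing value cast with (int) becomes 0). Under that assumption, a minimal sketch is to build the $set array from only the keys that were actually sent. The helper name buildSetDocument and the sample data are hypothetical.

```php
<?php
// Hypothetical helper: keep only the whitelisted fields that are present
// in the request, so absent fields are never written to the document.

function buildSetDocument(array $postData, array $allowedFields): array
{
    $set = [];
    foreach ($allowedFields as $field) {
        // array_key_exists distinguishes "not sent" from "sent as 0".
        if (array_key_exists($field, $postData)) {
            $set[$field] = (int) $postData[$field];
        }
    }
    return $set;
}

$allowed = ['user_id', 'story', 'sound_control', 'clock_control'];
$sent = ['story' => '3', 'sound_control' => '1', 'unknown' => 'x'];
$set = buildSetDocument($sent, $allowed);
// $set holds only 'story' and 'sound_control'; 'unknown' is ignored.

// The result could then be passed to updateOne, skipping empty updates:
// if ($set !== []) {
//     $collection->updateOne(['id' => (int) $row['id']], ['$set' => $set]);
// }
```

The whitelist also keeps arbitrary request keys out of the stored document.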

Elasticsearch. Data update and keeping history of previous values of fields

Part of my mapping is:
"current_price" => ["type" => "float"],
"price_history" => [
"type" => "nested",
"properties" => [
"date" => ["type" => "date"],
"value" => ["type" => "float"]
]
]
As you can see, I store the current price of goods and all previous values. The first thing I would like to note is that when I create goods for the very first time, I have no history, of course. That's why, when I create goods, I do not use price_history at all, although it exists in my mapping.
$params = [
'index' => config('storesettings.esIndex'),
'type' => config('storesettings.esType'),
'id' => $id,
'body' => [
...
"current_price" => $request->get('current_price'),
...
]
];
When I edit goods, I change the price. In this case I need to archive the current price by moving it to the price_history field, and then replace the current name. The question is about the price_history field. I get the previous value ($goods['_source']['price_history']) and then add current_name to this array. Everything is fine when I already have some history. But if I have none, I get the error 'Undefined index: price_history'. In that case I have to check if(isset($goods['_source']['price_history'])). Is this normal? In a relational database I would have an empty array, but in Elasticsearch I don't, and I must do an array-level check (so to speak). How should I handle such cases? Maybe I should add an empty array to price_history when I create goods?
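A minimal sketch of the check described above, assuming PHP 7 or later so that the null coalescing operator can stand in for isset(). The function name archivePrice, the date format and the sample document are assumptions for illustration.

```php
<?php
// Sketch only: move the current price into price_history, treating a
// missing price_history field as an empty history.

function archivePrice(array $source, float $newPrice, string $date): array
{
    // ?? yields [] when the key is absent or null, so no notice is raised.
    $history = $source['price_history'] ?? [];

    if (isset($source['current_price'])) {
        $history[] = ['date' => $date, 'value' => $source['current_price']];
    }

    $source['price_history'] = $history;
    $source['current_price'] = $newPrice;

    return $source;
}

// A document created without any history, as in the question.
$goods = ['_source' => ['current_price' => 10.5]];
$updated = archivePrice($goods['_source'], 12.0, '2016-01-01');
```

The other option raised in the question, indexing an empty price_history array at creation time, would make the check unnecessary on the PHP side for documents created after that change.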

How to add checkbox to stwe/DatatablesBundle for symfony2

I am using https://github.com/stwe/DatatablesBundle for my Symfony2 application. It's still buggy but works great. My table has the columns id and name. I want to add a checkbox to the id column so that I can select rows by their id. I've searched a lot but couldn't find a solution. Can anyone help me add a checkbox to the id column for selection?
One thing: my 'serverSide' is set to false.
$this->columnBuilder
->add("id", "column", array("title" => "Id","type" => "checkbox",))
->add("name", "column", array("title" => "Name",))
This is my code for generating the columns.
OK, I found a solution. I'm not sure it's the best way to solve the problem, but it works for my case.
->add(null, "multiselect", array(
"attributes" => array(
"name" => "check", ),
"actions" => array(
array(
"route" => "artist_show",
"route_parameters" => array(
"id" => "id"
),
"label" => "Select",
)
)))
Adding the multiselect option removes the need for a separate checkbox column.

Optimizing query with large result set

I have a CakePHP model, let's call it Thing which has an associated model called ItemView. ItemView represents one page view of the Thing item. I want to display how many times Thing has been viewed, so I do the following in my view:
<?php echo count($thing['ItemView']); ?>
This works, however as time goes on the result set of this query is going to get huge, as it's currently being returned like so:
array(
'Thing' => array(
'id' => '1',
'thing' => 'something'
),
'ItemView' => array(
(int) 0 => array(
'id' => '1',
'thing_id' => 1,
'created' => '2013-09-21 19:25:39',
'ip_address' => '127.0.0.1'
),
(int) 1 => array(
'id' => '1',
'thing_id' => 1,
'created' => '2013-09-21 19:25:41',
'ip_address' => '127.0.0.1'
),
// etc...
)
)
How can I adapt the model find() to retrieve something like so:
array(
'Thing' => array(
'id' => '1',
'thing' => 'something',
'views' => 2
)
)
without loading the entire ItemView relation into memory?
Thanks!
So it's pretty straightforward: we can make use of counterCache. Cake does the counting for you whenever a record is added to or deleted from ItemView:
Nothing to change in your Thing.php model
Add a new INT column views in your things table.
In your ItemView.php model, add counterCache like this:
public $belongsTo = array(
'Thing' => array(
'counterCache' => 'views'
)
);
The next time you add or delete via ItemView, Cake will automatically recalculate the count and cache it in views for you. When you run the query, you also need to make sure you specify recursive = -1, as @Paco Car has suggested in his answer:
$this->Thing->recursive = -1;
$this->Thing->find(...); // this returns an array of Thing + the field "views"
// --- OR ---
// 'all' is used here as an example find type
$this->Thing->find('all', array(
'conditions' => array(
//... your usual conditions here
),
//... fields, order... etc
// this makes sure recursive applies to this call only
'recursive' => -1
));

understanding ElasticSearch routing

I am trying to use the Elasticsearch routing mapping to speed up some queries, but I am not getting the expected result set (I'm not worried about query performance just yet).
I am using Elastica to set up my mapping:
$index->create(array('number_of_shards' => 4,
'number_of_replicas' => 1,
'mappings'=>array("country"=>array("_routing"=>array("path"=>"countrycode"))),
'analysis' => array(
'analyzer' => array(
'indexAnalyzer' => array(
'type' => 'keyword',
'tokenizer' => 'nGram',
'filter' => array('shingle')
),
'searchAnalyzer' => array(
'type' => 'keyword',
'tokenizer' => 'nGram',
'filter' => array('shingle')
)
)
) ), true);
If I understand correctly, each result should now have a field called "countrycode" with the value of "country" in it.
The results of _mapping look like this:
{"postcode":
{"postcode":
{"properties":
{
"area1":{"type":"string"},
"area2":{"type":"string"},
"city":{"type":"string",
"include_in_all":true},
"country":{"type":"string"},
"country_iso":{"type":"string"},
"country_name":{"type":"string"},
"id":{"type":"string"},
"lat":{"type":"string"},
"lng":{"type":"string"},
"location":{"type":"geo_point"},
"region1":{"type":"string"},
"region2":{"type":"string"},
"region3":{"type":"string"},
"region4":{"type":"string"},
"state_abr":{"type":"string"},
"zip":{"type":"string","include_in_all":true}}},
"country":{
"_routing":{"path":"countrycode"},
"properties":{}
}
}
}
Once all the data is in the index if I run this command:
http://localhost:9200/postcode/_search?pretty=true&q=country:au
It responds with 15740 total items.
What I was expecting is that if I run the query like this:
http://localhost:9200/postcode/_search?routing=au&pretty=true
then it would also respond with 15740 results.
Instead it returns 120617 results, which include results where country != au.
I did note that the number of shards in the results went from 4 to 1, so something is working.
I was also expecting the result set to contain an item called "countrycode" (from the routing mapping), but there isn't one.
So at this point I thought my understanding of routing was wrong. Perhaps all routing does is tell Elasticsearch which shard to look in, but not what to look for? In other words, if other country codes happen to land in that particular shard, will the queries as written just bring back all records in that shard?
So I tried the query again, this time adding some info to it.
http://localhost:9200/postcode/_search?routing=AU&pretty=true&q=country:AU
I thought this would force the query to give me just the AU place names, but this time it gave me only 3936 results.
So I am not quite sure what I have done wrong. The examples I have read show the queries changing from needing a filter to just using match_all{}, which I would have thought would only bring back results matching the au country code.
Thanks for your help in getting this to work correctly.
Almost have this working: it now gives me the correct number of results in a single shard. However, the index creation is not working quite right; it ignores my number_of_shards setting, and possibly other settings too.
$index = $client->getIndex($indexname);
$index->create(array('mappings'=>array("$indexname"=>array("_routing"=>array("required"=>true))),'number_of_shards' => 6,
'number_of_replicas' => 1,
'analysis' => array(
'analyzer' => array(
'indexAnalyzer' => array(
'type' => 'keyword',
'tokenizer' => 'nGram',
'filter' => array('shingle')
),
'searchAnalyzer' => array(
'type' => 'keyword',
'tokenizer' => 'nGram',
'filter' => array('shingle')
)
)
) ), true);
I can at least help you with more info on where to look:
http://localhost:9200/postcode/_search?routing=au&pretty=true
That query does indeed translate into "give me all documents on the shard where documents for country:AU should be sent."
Routing is just that, routing ... it doesn't filter your results for you.
Also, I noticed you're mixing your "au"s and your "AU"s; that might mix things up too.
You should try setting required on your routing element to true, to make sure that your documents are actually stored with routing information when they are indexed.
In fact, to make sure your documents are indexed with proper routing, explicitly set the routing value to lowercase(countrycode) when indexing documents. See if that helps.
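To illustrate why routing alone does not filter: Elasticsearch picks a shard by hashing the routing value modulo the number of primary shards, so several country codes can share a shard. The sketch below is only an illustration of that idea. crc32 is a stand-in for Elasticsearch's internal hash, and the function names and country list are assumptions, so the shard numbers it produces will not match a real cluster.

```php
<?php
// Illustration only: crc32 stands in for Elasticsearch's internal hash.

function shardFor(string $routing, int $numberOfShards): int
{
    // Mask to keep the value non-negative on 32-bit builds of PHP.
    return (crc32($routing) & 0x7fffffff) % $numberOfShards;
}

function normalizedRouting(string $countryCode): string
{
    // Use one casing at both index time and search time.
    return strtolower(trim($countryCode));
}

// Five routing values on four shards: at least two must share a shard,
// so a search with routing=au can see documents from other countries
// unless the query also filters on country.
$shards = [];
foreach (['au', 'nz', 'us', 'gb', 'de'] as $code) {
    $shards[$code] = shardFor(normalizedRouting($code), 4);
}

// "AU" and "au" are different routing values unless normalized first.
$sameShard = shardFor(normalizedRouting('AU'), 4) === $shards['au'];
```

This is why the search still needs q=country:au (or an equivalent filter) alongside routing=au: routing narrows where to look, and the query decides what to return.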
For more information try reading this blog post:
http://www.elasticsearch.org/blog/customizing-your-document-routing/
Hope this helps :)
