How to search elasticsearch case insensitive - php

I am using PHP's client library for Elasticsearch. I'd like to create an index that stores a person's id and name, and allows the user to search for names in a very flexible way (case insensitive, partial names, etc.).
Here is a code snippet of what I have so far, annotated with comments for convenience:
<?php
require_once(__DIR__ . '/../init.php');
$client = new Elasticsearch\Client();
$params = [
'index' => 'person',
'body' => [
'settings' => [
// Simple settings for now, single shard
'number_of_shards' => 1,
'number_of_replicas' => 0,
'analysis' => [
'filter' => [
'shingle' => [
'type' => 'shingle'
]
],
'analyzer' => [
'my_ngram_analyzer' => [
'tokenizer' => 'my_ngram_tokenizer',
]
],
// Allow searching for partial names with nGram
'tokenizer' => [
'my_ngram_tokenizer' => [
'type' => 'nGram',
'min_gram' => 1,
'max_gram' => 15,
'token_chars' => ['letter', 'digit']
]
]
]
],
'mappings' => [
'_default_' => [
'properties' => [
'person_id' => [
'type' => 'string',
'index' => 'not_analyzed',
],
// The name of the person
'value' => [
'type' => 'string',
'analyzer' => 'my_ngram_analyzer',
'term_vector' => 'yes',
'copy_to' => 'combined'
],
]
],
]
]
];
// Create index `person` with ngram indexing
$client->indices()->create($params);
// Index a single person using this indexing scheme
$params = array();
$params['body'] = array('person_id' => '1234', 'value' => 'Johnny Appleseed');
$params['index'] = 'person';
$params['type'] = 'type';
$params['id'] = 'id';
$ret = $client->index($params);
// Get that document (to prove it's in there)
$getParams = array();
$getParams['index'] = 'person';
$getParams['type'] = 'type';
$getParams['id'] = 'id';
$retDoc = $client->get($getParams);
print_r($retDoc); // success
// Search for that document
$searchParams['index'] = 'person';
$searchParams['type'] = 'type';
$searchParams['body']['query']['match']['value'] = 'J';
$queryResponse = $client->search($searchParams);
print_r($queryResponse); // FAILURE
// blow away index so that we can run the script again immediately
$deleteParams = array();
$deleteParams['index'] = 'person';
$retDelete = $client->indices()->delete($deleteParams);
I have had this search feature working at times, but I've been fussing with the script to get the case insensitive feature working as expected, and in the process, the script now fails to find any person with a J or j used as the query value to match.
Any ideas what might be going on here?

To fix the case insensitive bit, I added
'filter' => 'lowercase',
to my ngram analyzer.
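Also, the reason it was failing to begin with is that, with PHP's client library, I couldn't create the index, index a document, and search for it all in the same script. My guess is that something asynchronous is going on: Elasticsearch is near real-time, so a newly indexed document only becomes searchable after the next index refresh (roughly once per second by default). If you create the index in one script and search it in another, it should work.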
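As a rough sketch of both points: the analyzer definition with the lowercase token filter added, and (in comments only) an explicit refresh between indexing and searching. The setting names are taken from the question. The helper name buildAnalysisSettings is made up for illustration, and the refresh call is an assumption about how to make the document searchable within a single script; it is commented out because it needs a running cluster.

```php
<?php
// Builds the 'analysis' settings from the question, with the
// lowercase token filter added so that matching ignores case.
// buildAnalysisSettings() is a hypothetical helper name.
function buildAnalysisSettings(): array
{
    return [
        'analyzer' => [
            'my_ngram_analyzer' => [
                'tokenizer' => 'my_ngram_tokenizer',
                'filter'    => ['lowercase'],
            ],
        ],
        'tokenizer' => [
            'my_ngram_tokenizer' => [
                'type'        => 'nGram',
                'min_gram'    => 1,
                'max_gram'    => 15,
                'token_chars' => ['letter', 'digit'],
            ],
        ],
    ];
}

$analysis = buildAnalysisSettings();
echo json_encode($analysis, JSON_PRETTY_PRINT), PHP_EOL;

// Assumption: $client is the Elasticsearch\Client from the question.
// After indexing and before searching in the same script:
// $client->indices()->refresh(['index' => 'person']);
```

Because the same analyzer is applied to the query text of a match query, a search for 'J' is lowercased too and can match the indexed 'j' token.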

Related

Validate that JSON array has one associative array with fixed integer value

I am trying to validate some JSON using Opis's package. I am trying to validate that an array has at least one associative array with an id of value 1. Here is the code I've got:
$json = [
[
'id' => 1,
],
[
'id' => 2,
],
[
'id' => 3
]
];
$rules = [
'type' => 'array',
'contains' => [
'type' => 'array',
'properties' => [
'id' => [
'type' => 'integer',
'const' => 1,
],
],
'required' => ['id']
],
'minContains' => 1,
];
$validated = Common::validateJSON($json, json_encode($rules));
and here is the validateJSON method code:
public static function validateJSON($json, $rules)
{
$validator = new Validator();
// Validate
$result = $validator->validate($json, $rules);
if ($result->isValid()) {
return true;
}
$errorMessages = [];
if ($result->hasError()) {
$formatter = new ErrorFormatter();
$errorMessages[] = $formatter->format($result->error());
}
return $errorMessages;
}
so, in this case $validated returns:
array:1 [
0 => array:1 [
"/" => array:1 [
0 => "At least 1 array items must match schema"
]
]
]
changing $rules to this:
$rules = [
'type' => 'array',
'contains' => [
'type' => 'array',
],
'minContains' => 1,
];
returns the same result, which seems weird to me.
Changing const to any other number doesn't change what is returned. So my guess is that I am doing something wrong, but I don't know what.
I've been googling various things, but nothing helped. I've been looking at the JSON Schema site, particularly here, but I haven't figured it out.
Because the data does not come from an HTTP request, it is never JSON-decoded and is still a plain PHP array. Before validating, encode and decode it:
$json = json_encode($json);
$json = json_decode($json); // this, I think, will turn associative arrays into objects which makes it work
The second type (the one inside contains) must then be object instead of array.
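A minimal sketch of what the round trip does, using only core PHP (no Opis calls are made here). The adjusted $rules array is my reading of the advice above: the outer type stays array, and the type inside contains becomes object.

```php
<?php
// The data as built in the question: a list of associative arrays.
$data = [
    ['id' => 1],
    ['id' => 2],
    ['id' => 3],
];

// Round trip through JSON: the outer list stays a PHP array,
// but each associative array becomes a stdClass object.
$decoded = json_decode(json_encode($data));

var_dump(is_array($decoded));               // bool(true)
var_dump($decoded[0] instanceof stdClass);  // bool(true)

// Schema adjusted so that 'contains' describes an object.
$rules = [
    'type' => 'array',
    'contains' => [
        'type' => 'object',
        'properties' => [
            'id' => ['type' => 'integer', 'const' => 1],
        ],
        'required' => ['id'],
    ],
    'minContains' => 1,
];
$schema = json_encode($rules);
```

With the original data, every item was a PHP array, so no item could satisfy a schema that only an object can fully match, which would explain why changing const made no difference.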

Create Associative Array with Foreach, Insert into existing Associative Array

Hello.
I currently have a problem with the AWS Route-53 API. To create a record you need to call a function, which itself needs an array of inputs.
I want to create a record set, and for that I have some POST values. One of them, $_POST['record_value'], comes from a textarea and can have multiple lines, so that one record can have multiple values. I loop through those lines. With a single hardcoded value in ResourceRecords, the code looks like this:
$result = $this->route53->changeResourceRecordSets([
'ChangeBatch' => [
'Changes' => [
[
'Action' => 'CREATE',
'ResourceRecordSet' => [
'Name' => $recordName,
'ResourceRecords' => [
[
'Value' => $recordValue
],
],
'TTL' => $recordTtl,
'Type' => $recordType,
],
],
],
'Comment' => 'Routing Record Set',
],
'HostedZoneId' => $this->zone,
]);
However, I want to build ResourceRecords dynamically. For every line in the textarea I need a new copy of the following part of the code:
[
'Value' => $recordValue
],
What I came up with is the following:
$newData = [];
foreach(explode("\r\n", $recordValue) as $valLine) {
$newData[] = ["Value" => $valLine];
}
$result = $this->route53->changeResourceRecordSets([
'ChangeBatch' => [
'Changes' => [
[
'Action' => 'CREATE',
'ResourceRecordSet' => [
'Name' => $recordName,
'ResourceRecords' => [
$newData
],
'TTL' => $recordTtl,
'Type' => $recordType,
],
],
],
'Comment' => 'Routing Record Set',
],
'HostedZoneId' => $this->zone,
]);
However, this throws an exception: Found 1 error while validating the input provided for the ChangeResourceRecordSets operation: [ChangeBatch][Changes][0][ResourceRecordSet][ResourceRecords][0] must be an associative array. Found array(1).
Am I building the array wrong, or am I going about this the wrong way altogether?
$newData is already a list of ['Value' => ...] arrays, so you don't need to wrap it in another array:
'ResourceRecords' => $newData,
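A small sketch of building that list from the textarea text. The helper name buildResourceRecords is hypothetical, and splitting on any kind of line break and skipping blank lines are my additions, on the assumption that browsers and users won't always send clean "\r\n"-separated input.

```php
<?php
// Turns multi-line text into the list-of-arrays shape that
// ResourceRecords expects: [['Value' => ...], ['Value' => ...], ...].
function buildResourceRecords(string $text): array
{
    $records = [];
    // \R matches \r\n, \n and \r, so mixed line endings are handled.
    foreach (preg_split('/\R/', $text) as $line) {
        $line = trim($line);
        if ($line === '') {
            continue; // skip blank lines
        }
        $records[] = ['Value' => $line];
    }
    return $records;
}

$records = buildResourceRecords("192.0.2.1\r\n192.0.2.2\n\n192.0.2.3");
print_r($records);
```

The result can then be passed directly as 'ResourceRecords' => $records, without any extra wrapping.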

Can't update map column of DynamoDB table

I am currently developing a skill for Amazon's Echo Dot which requires the use of persistent data. While developing a web interface for the skill, I ran into an issue: I was not able to update the mapAttr column of the DynamoDB table used by the skill.
I've been trying to work this out for the last 2 days. I've looked everywhere, including the documentation, but can't seem to find anything that helps.
This is the code I am using:
$result = $client->updateItem([
'TableName' => 'rememberThisDBNemo',
'Key' => [
'userId' => [ 'S' => $_SESSION['userDataAsk'] ]
],
'ExpressionAttributeNames' => [
'#attr' => 'mapAttr.ReminderJSON'
],
'ExpressionAttributeValues' => [
':val1' => json_encode($value)
],
'UpdateExpression' => 'SET #attr = :val1'
]);
I have tried many different things so this might be just absolutely wrong, but nothing that I have found has worked.
The table has 2 columns, userId and mapAttr, userId is a string and mapAttr is a map. Originally I thought it was simply a JSON string but it was not like that as when I tried to update it with a JSON string directly it would stop working when read by Alexa.
I am only trying to update 1 out of the 2 attributes of mapAttr. That is ReminderJSON which is a string.
Any help would be appreciated. Thanks.
Try calling updateItem like this
$result = $client->updateItem([
'TableName' => 'rememberThisDBNemo',
'Key' => [
'userId' => [ 'S' => $_SESSION['userDataAsk'] ]
],
'ExpressionAttributeNames' => [
'#mapAttr' => 'mapAttr',
'#attr' => 'ReminderJSON'
],
'ExpressionAttributeValues' => [
':val1' => ['S' => json_encode($value)]
],
'UpdateExpression' => 'SET #mapAttr.#attr = :val1'
]);
However, please be aware that for this to work, the attribute mapAttr must already exist. If it doesn't, you'll get a ValidationException saying: The document path provided in the update expression is invalid for update...
As a workaround, you can add 'ConditionExpression' => 'attribute_exists(#mapAttr)' to your params, catch the resulting exception, and then perform a second update that adds the new attribute mapAttr:
try {
$result = $client->updateItem([
'TableName' => 'rememberThisDBNemo',
'Key' => [
'userId' => [ 'S' => $_SESSION['userDataAsk'] ]
],
'ExpressionAttributeNames' => [
'#mapAttr' => 'mapAttr',
'#attr' => 'ReminderJSON'
],
'ExpressionAttributeValues' => [
':val1' => ['S' => json_encode($value)]
],
'UpdateExpression' => 'SET #mapAttr.#attr = :val1',
'ConditionExpression' => 'attribute_exists(#mapAttr)'
]);
} catch (\Aws\Exception\AwsException $e) {
if ($e->getAwsErrorCode() == "ConditionalCheckFailedException") {
$result = $client->updateItem([
'TableName' => 'rememberThisDBNemo',
'Key' => [
'userId' => [ 'S' => $_SESSION['userDataAsk'] ]
],
'ExpressionAttributeNames' => [
'#mapAttr' => 'mapAttr'
],
'ExpressionAttributeValues' => [
':mapValue' => ['M' => ['ReminderJSON' => ['S' => json_encode($value)]]]
],
'UpdateExpression' => 'SET #mapAttr = :mapValue',
'ConditionExpression' => 'attribute_not_exists(#mapAttr)'
]);
}
}
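The two changes that matter in the answer are that each segment of the document path gets its own placeholder (a dot inside an ExpressionAttributeNames value is treated as part of a single attribute name, not as a path separator) and that the value carries a type descriptor such as 'S'. A sketch of a helper that assembles such parameters; buildMapFieldUpdate is a hypothetical name and not part of the AWS SDK, and no request is sent here.

```php
<?php
// Assembles updateItem parameters for setting one string field
// inside a map attribute. Hypothetical helper, for illustration only.
function buildMapFieldUpdate(
    string $table,
    string $userId,
    string $mapName,
    string $field,
    string $value
): array {
    return [
        'TableName' => $table,
        'Key' => [
            'userId' => ['S' => $userId],
        ],
        // One placeholder per path segment.
        'ExpressionAttributeNames' => [
            '#map'   => $mapName,
            '#field' => $field,
        ],
        // Values need a DynamoDB type descriptor ('S' = string).
        'ExpressionAttributeValues' => [
            ':val' => ['S' => $value],
        ],
        'UpdateExpression'    => 'SET #map.#field = :val',
        'ConditionExpression' => 'attribute_exists(#map)',
    ];
}

$params = buildMapFieldUpdate(
    'rememberThisDBNemo',
    'example-user-id', // placeholder value
    'mapAttr',
    'ReminderJSON',
    json_encode(['reminder' => 'example'])
);
print_r($params);
```

The resulting array would be passed to $client->updateItem($params), inside the same try/catch as in the answer.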

PHP wont JSON encode multiple Array()'s with same name

I have the structure below, which I need to json_encode so that it can later be decoded into an object.
This would allow me to have multiple objects with the name message, loop through them, and process each message individually.
However, when encoded, PHP only encodes the key and one of the message arrays: the last one.
$setup = [
'key' => 'demo-7hn3fh83un3yhvfjvnjgknfhjnvf',
'message' => [
'number' => [
'+39XXXXXXXX',
'+34XXXXXXXX',
'+49XXXXXXXX'
],
'text' => 'Sample msg 123...',
],
'message' => [
'number' => [
'+50XXXXXXXX',
'+50XXXXXXXX'
],
'text' => 'Something...',
]
];
Is there a way to encode multiple arrays with the same name?
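You've overlooked the root issue: a PHP array cannot contain the same key twice, so each later value overwrites the previous one before json_encode ever sees it: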
$foo = [
'bar' => 1,
'bar' => 2,
'bar' => 3,
];
var_export($foo);
This prints:
array (
'bar' => 3,
)
Thanks for the tips, everyone. I ended up modifying the structure as shown below.
The reason I am going with a structure like this is that it allows me to submit multiple messages to multiple users with a single request.
$setup = [
'key' => 'demo-7hn3fh83un3yhvfjvnjgknfhjnvf',
'message' => [
[
'number' => [
'+39XXXXXXXX',
'+34XXXXXXXX',
'+49XXXXXXXX'
],
'text' => 'Sample msg 123...'
],
[
'number' => [
'+50XXXXXXXX',
'+50XXXXXXXX'
],
'text' => 'Something...'
]
]
];
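A short sketch of the round trip with this structure: message is now a list, so it survives json_encode and can be looped over after decoding. The key and phone numbers are shortened placeholders.

```php
<?php
// 'message' holds a list of message arrays instead of repeated keys.
$setup = [
    'key' => 'demo-key', // placeholder
    'message' => [
        [
            'number' => ['+39XXXXXXXX', '+34XXXXXXXX'],
            'text'   => 'Sample msg 123...',
        ],
        [
            'number' => ['+50XXXXXXXX'],
            'text'   => 'Something...',
        ],
    ],
];

$json    = json_encode($setup);
$decoded = json_decode($json); // objects, with 'message' as an array

// Process each message individually.
foreach ($decoded->message as $message) {
    printf("%s -> %s\n", $message->text, implode(', ', $message->number));
}
```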

Highlighting does not work in Elasticsearch and PHP

I've just downloaded and installed the latest version of Elasticsearch on my Windows machine. I ran my first search queries and everything seemed to work fine. However, when I try to highlight the search results, it fails. This is what my query looks like:
$params = [
'index' => 'test_index',
'type' => 'test_index_type',
'body' => [
'query' => [
'bool' => [
'should' => [ 'match' => [ 'field1' => '23' ] ]
]
],
'highlight' => [
'pre_tags' => "<em>",
'post_tags' => "</em>",
'fields' => (object)Array('field1' => new stdClass),
'require_field_match' => false
]
]
];
$res = $client->search($params);
On the whole, the query itself works well: the results are filtered. In the console I can see that all returned documents do contain the value "23" in their field1 field. However, the <em></em> tags are simply not added to the result. What I see is just the raw value of field1, like "some text 23" or "23 another text", rather than what I expect: "some text <em>23</em>", "<em>23</em> another text". What is wrong, and how can I fix it?
From the manual:
The value of pre_tags and post_tags should be an array (however, if you don't want to change the em tags you can omit them; they are already set by default).
The fields value should be an array whose keys are the field names and whose values are the options for each field.
Try this fix:
$params = [
'index' => 'test_index',
'type' => 'test_index_type',
'body' => [
'query' => [
'bool' => [
'should' => [ 'match' => [ 'field1' => '23' ] ]
]
],
'highlight' => [
// 'pre_tags' => ["<em>"], // not required
// 'post_tags' => ["</em>"], // not required
'fields' => [
'field1' => new \stdClass()
],
'require_field_match' => false
]
]
];
$res = $client->search($params);
var_dump($res['hits']['hits'][0]['highlight']);
Update
After double-checking: the value of each field in the fields array must be an object (this is a requirement, unlike some of the other options).
The pre_tags and post_tags values can also be strings rather than arrays.
Did you check the correct part of the response? The important thing to notice is that the highlighted results go into the highlight array of each hit: $res['hits']['hits'][0]['highlight'].
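To illustrate where to look, here is a sketch that reads the highlighted fragments out of a response and falls back to the raw _source value when a hit has no highlight for that field. The $res array is a hand-written stand-in shaped like a search response, and collectHighlights is a hypothetical helper, not part of the client library.

```php
<?php
// Hand-written stand-in for what $client->search($params) returns.
$res = [
    'hits' => [
        'hits' => [
            [
                '_source'   => ['field1' => 'some text 23'],
                'highlight' => ['field1' => ['some text <em>23</em>']],
            ],
            [
                // A hit without a highlight entry for field1.
                '_source' => ['field1' => 'plain text'],
            ],
        ],
    ],
];

// Returns one string per hit: the first highlighted fragment if
// present, otherwise the unmodified value from _source.
function collectHighlights(array $res, string $field): array
{
    $out = [];
    foreach ($res['hits']['hits'] ?? [] as $hit) {
        $out[] = $hit['highlight'][$field][0]
            ?? ($hit['_source'][$field] ?? null);
    }
    return $out;
}

$fragments = collectHighlights($res, 'field1');
print_r($fragments);
```

The _source of each hit is never modified by highlighting, which is why the raw field values in the question showed no tags.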
