I'm using Elasticsearch version 8.4 with PHP. I created the index articles, and it contains all the records present in the corresponding database table.
I need to run an Elasticsearch search that returns the same results as this SQL query:
SELECT *
FROM articles
WHERE title LIKE '%Document%'
However, the results are not the same with the Elasticsearch PHP client. The PHP code follows:
<?php
require_once "vendor/autoload.php";
use Elastic\Elasticsearch\ClientBuilder;
$client = ClientBuilder::create()
->setHosts(['localhost:9200'])
->setBasicAuthentication('elastic','secret')
->build();
$params = [
'index' => 'articles',
'from' => 0,
'size' => 5000,
'body' => [
'query' => [
'match' => [
'title' => 'Document'
]
]
]
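];
// Run the search; without this call $results is never defined.
$results = $client->search($params);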
if (!empty($results['hits']['hits']))
{
echo "<pre>";
print_r($results['hits']['hits']);
echo "</pre>";
}
I tried 'wildcards' and 'regexp' instead of 'match', but neither worked.
I read these pages of the docs for help with this case:
https://www.elastic.co/guide/en/elasticsearch/reference/current/sql-like-rlike-operators.html#sql-like-operator
https://www.elastic.co/guide/en/elasticsearch/client/php-api/8.4/search_operations.html
Is there any way to make this Elasticsearch PHP code return the same results as the SQL query executed directly in the database?
I'm trying to use the Firebase PHP API to update/append a document field's array with a map
I have the following code in Python that works fine
ref = db.collection(u'jobs').document(jobId)
ref.update({
u'messages': firestore.ArrayUnion([{
u'category': u'0',
u'message': u'TEST',
u'sender': u'TEAM',
}])
})
Though when I try to replicate it in PHP, it doesn't work. I tried a lot of different ways to view the errors, but all I get is 500 INTERNAL SERVER ERROR.
require 'vendor/autoload.php';
use Google\Cloud\Firestore\FirestoreClient;
use Google\Cloud\Firestore\FieldValue;
$firestore = new FirestoreClient([
'projectId' => 'XXX-XX',
'credentials' => 'key.json'
]);
$jobId = "XXXX";
$docRef = $firestore->collection('jobs')->document($jobId);
$docRef->update([
'messages' => FieldValue::arrayUnion([{
'category' : '0',
'message' : 'TEST',
'sender' : 'TEAM',
}])
]);
I looked up samples of Array Union in PHP and of adding data with PHP. I've tried lots of variations of : or => or arrayUnion([]) or arrayUnion({[]}), to no avail.
Any idea what is causing this?
Looks like there are a few things going wrong here.
First, PHP uses arrays for both maps and "normal" arrays. There is no object literal ({}) in PHP. Array values are specified using the => operator, not :.
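To make the difference concrete, here is a minimal sketch in plain PHP (no Firestore involved) of a list, a map, and a list containing one map. The variable names are only for illustration; json_encode is used just to show how each shape is interpreted.

```php
<?php
// A list: sequential integer keys, encoded as a JSON array.
$list = ['a', 'b'];

// A map: string keys assigned with =>, encoded as a JSON object.
$map = ['category' => '0', 'message' => 'TEST', 'sender' => 'TEAM'];

// A list containing one map: the shape passed to arrayUnion() in this question.
$listOfMaps = [$map];

echo json_encode($list), "\n";
echo json_encode($map), "\n";
echo json_encode($listOfMaps), "\n";
```

The last shape, a list wrapping a map, is why the corrected calls use double brackets ([[ ... ]]).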
Second, DocumentReference::update() accepts a list of values you wish to change, with the path and value. So an update call would look like this:
$docRef->update([
['path' => 'foo', 'value' => 'bar']
]);
You can use DocumentReference::set() for the behavior you desire. set() will create a document if it does not exist, where update() will raise an error if the document does not exist. set() will also replace all the existing fields in the document unless you specify merge behavior:
$docRef->set([
'foo' => 'bar'
], ['merge' => true]);
Therefore, your code can be re-written as either of the following:
$jobId = "XXXX";
$docRef = $firestore->collection('jobs')->document($jobId);
$docRef->set([
'messages' => FieldValue::arrayUnion([[
'category' => '0',
'message' => 'TEST',
'sender' => 'TEAM',
]])
], ['merge' => true]);
$jobId = "XXXX";
$docRef = $firestore->collection('jobs')->document($jobId);
$docRef->update([
[
'path' => 'messages', 'value' => FieldValue::arrayUnion([[
'category' => '0',
'message' => 'TEST',
'sender' => 'TEAM',
]])
]
]);
One final thing to note: arrayUnion will not append duplicate values. So if the value you provide (including all keys and values in the nested map) already exists, it will not be appended to the document.
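The duplicate rule can be illustrated locally with a small sketch. The helper arrayUnionSketch below is hypothetical and is not how Firestore implements arrayUnion; it only mimics the "skip elements that already exist" behaviour on plain PHP arrays. One known difference: PHP's strict array comparison also depends on key order.

```php
<?php
// Hypothetical helper (not part of the Firestore client): appends only the
// candidates that are not already present in $existing.
function arrayUnionSketch(array $existing, array $toAdd): array
{
    foreach ($toAdd as $candidate) {
        // Strict in_array compares nested arrays by keys, values, types and key order.
        if (!in_array($candidate, $existing, true)) {
            $existing[] = $candidate;
        }
    }
    return $existing;
}

$messages = [['category' => '0', 'message' => 'TEST', 'sender' => 'TEAM']];

// Identical map: nothing is appended.
$same = arrayUnionSketch($messages, [['category' => '0', 'message' => 'TEST', 'sender' => 'TEAM']]);

// One value differs: the map is appended.
$more = arrayUnionSketch($messages, [['category' => '0', 'message' => 'TEST 2', 'sender' => 'TEAM']]);

echo count($same), ' ', count($more), "\n";
```

If every message must be stored even when it repeats, adding a distinguishing field (a timestamp, for example) to each map is one way around this.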
If you haven't already, turn up error reporting in your development environment to receive information about why your code is failing. PHP will inform you about the parse errors your snippet included, and the Firestore client will give you errors which can often be quite useful.
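As a rough sketch of what "turning up error reporting" can look like for a development environment only, the lines below could go at the very top of the entry script. Note that a parse error in the same file is raised before these lines run, so for parse errors the equivalent settings need to be in php.ini or in a file that includes the broken one.

```php
<?php
// Development only: report every error level.
error_reporting(E_ALL);

// Print errors in the response instead of returning a bare 500.
ini_set('display_errors', '1');
```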
From Firebase Documentation:
$cityRef = $db->collection('cities')->document('DC');
// Atomically add a new region to the "regions" array field.
$cityRef->update([
['path' => 'regions', 'value' => FieldValue::arrayUnion(['greater_virginia'])]
]);
I would assume you would like something like this:
$docRef = $firestore->collection('jobs')->document($jobId);
// Atomically add new values to the "messages" array field.
$docRef->update([
['path' => 'messages', 'value' => FieldValue::arrayUnion([[
'category' => '0',
'message' => 'TEST',
'sender' => 'TEAM',
]])]
]);
I'm currently working with the current PHP MongoDB\Driver.
I need to use a geoNear query to fetch points near my current location. The required 2dsphere index is already set, and the query works in the console and delivers multiple results:
db.runCommand({geoNear: 'pois', near: [ 52.264633, 6.12485 ], spherical: true, maxDistance: 1000, distanceField: 'distance'})
Since the previous methods are deprecated, I can't use the old aggregate functions.
I'm now trying to find the right way to build the query I need with the current Query or Command classes.
What I've tried is the following:
$query = array(
'geoNear' => 'pois',
"near" => array(
52.264633,
6.12485
),
"spherical" => true,
"maxDistance" => 1000,
"distanceField" => "distance"
);
$cmd = new MongoDB\Driver\Command($query);
$returnCursor = $this->conn->executeCommand("database.pois", $cmd);
$arrReturn = $returnCursor->toArray();
return $arrReturn;
If I use this, I get this runtime error:
"exception": [
{
"type": "MongoDB\\Driver\\Exception\\RuntimeException",
"code": 18,
"message": "Failed to decode document from the server."
}
]
I couldn't find a solution for my case, nor any more information about this error.
If I change the Command to a Query, the execution doesn't fail, but there are no results.
My MongoDB is version 3.2, my PHP version is 7.0.16-4+deb.sury.org~trusty+1, and the mongodb extension is version 1.2.3.
You can use aggregate with the new driver in the following way. Note that executeCommand() expects just the database name, not the database.collection namespace.
$pipeline = array(array(
'$geoNear'=> array(
'near' => array(
52.264633,
6.12485
),
'spherical' => true,
'maxDistance' => 1000,
'distanceField' => "distance"
)));
$cmd = new \MongoDB\Driver\Command([
'aggregate' => 'pois',
'pipeline' => $pipeline
]);
$returnCursor = $this->conn->executeCommand("database", $cmd);
$arrReturn = $returnCursor->toArray();
There is also a library from MongoDB that extends the driver's default functionality to make it a little more user-friendly,
but as it's not covered on the PHP website, it's easy to miss:
MongoDB\Collection::aggregate($pipeline, $options)
where
$pipeline = array(array(
'$geoNear'=> array(
'near' => array(
52.264633,
6.12485
),
'spherical' => true,
'maxDistance' => 1000,
'distanceField' => "distance"
)
));
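Putting the two pieces together, a minimal sketch could look like the following. The helper buildGeoNearPipeline is hypothetical, and the connection URI in the commented part is an assumption; the library calls shown there (MongoDB\Client::selectCollection and MongoDB\Collection::aggregate) come from the mongodb/mongodb Composer package and need a running server, so they are left as comments.

```php
<?php
// Hypothetical helper: builds the $geoNear pipeline shown above.
// MongoDB documents legacy coordinate pairs as longitude first, then latitude,
// so check that $near is in the order your data uses.
function buildGeoNearPipeline(array $near, $maxDistance): array
{
    return [[
        '$geoNear' => [
            'near' => $near,
            'spherical' => true,
            'maxDistance' => $maxDistance,
            'distanceField' => 'distance',
        ],
    ]];
}

$pipeline = buildGeoNearPipeline([52.264633, 6.12485], 1000);
echo json_encode($pipeline), "\n";

// With the library installed (composer require mongodb/mongodb), assuming a
// local server and the database/collection names from the question:
//
// $collection = (new MongoDB\Client('mongodb://localhost:27017'))
//     ->selectCollection('database', 'pois');
// foreach ($collection->aggregate($pipeline) as $poi) {
//     var_dump($poi);
// }
```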
I have a field called url that is set to not_analyzed when I index it:
'url' => [
'type' => 'string',
'index' => 'not_analyzed'
]
Here is my method to determine if a URL already exists in the index:
public function urlExists($index, $type, $url) {
$params = [
'index' => $index,
'type' => $type,
'body' => [
'query' => [
'match' => [
'url' => $url
]
]
]
];
$results = $this->client->count($params);
return ($results['count'] > 0);
}
This seems to work fine, but I can't be 100% sure it is the correct way to find an exact match. According to the docs, another way to do the search is with params like these:
$params = [
'index' => $index,
'type' => $type,
'body' => [
'query' => [
'filtered' => [
'filter' => [
'term' => [
'url' => $url
]
]
]
]
]
];
My question is would either params work the same way for a not_analyzed field?
The second query is the right approach. Term-level queries/filters should be used for exact matches. The biggest advantage is caching: Elasticsearch uses bitsets for this, so subsequent calls get quicker response times.
From the Docs
Exclude as many document as you can with a filter, then query just the
documents that remain.
Also, if you look at your output, you will find that the _score of every document is 1, because scoring is not applied to filters; the same goes for highlighting. With a match query you will see different _score values. Again, from the docs:
Keep in mind that once you wrap a query as a filter, it loses query
features like highlighting and scoring because these are not features
supported by filters.
Your first query uses match, which is meant for analyzed fields. For example, when you want both Google and google to match all documents containing google (case-insensitive), use a match query.
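One caveat for readers on newer versions: the filtered query was deprecated in Elasticsearch 2.0 and removed in 5.0, where the same idea is written as a bool query with a filter clause. The sketch below only builds the params array; buildUrlExistsParams is a hypothetical helper and the index, type, and URL values are made up for the example.

```php
<?php
// Hypothetical helper: builds count() parameters for an exact-match lookup
// using bool + filter, the replacement for the old "filtered" query.
function buildUrlExistsParams(string $index, string $type, string $url): array
{
    return [
        'index' => $index,
        'type' => $type,
        'body' => [
            'query' => [
                'bool' => [
                    'filter' => [
                        // term compares the stored value as-is, with no analysis.
                        'term' => ['url' => $url],
                    ],
                ],
            ],
        ],
    ];
}

$params = buildUrlExistsParams('pages', 'page', 'http://example.com/a');
echo json_encode($params['body'], JSON_PRETTY_PRINT | JSON_UNESCAPED_SLASHES), "\n";
```

The resulting array would be passed to $this->client->count($params) exactly as in the urlExists() method above.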
Hope this helps!!
I am attempting to build a Google-like search using Elasticsearch and PHP. I have been reading an Elasticsearch book, and I think I want to use the simple_query_string query type, which can take the keywords (or phrase) from a search box and try to find some or all of the terms entered.
I am using the PHP Elasticsearch library in my project, and after connecting to my server I pass my parameters to $client->search($params) to get a result.
I have this as my params array
$params =
[
'index' => 'letsmeetup',
'type' => 'person',
'body' =>
[
'query' =>
[
'simple_query_string' =>
[
'query' => $keywords,
'fields' => [
"first_name","last_name","bio","username","email_address","interests","skills"
]
]
]
]
];
I used a phrase like 'People who love php' and I get results. I tried 'real time web' (which is in my bio) and I get the correct result.
The problem is that when I try 'Er', knowing there is a first_name of "Erin", or 'Neo', when I have "neo4j" in my bio, it returns no results. Do I have the params array correct?
You need to use wildcards for this kind of query:
{
"query": {
"query_string": {
"fields": [ "first_name","last_name","bio","username","email_address","interests","skills"],
"query": "Er*"
}
}
}
This will match "Erin", "Eric", "Error" and so on.
You can find more information about Query String Syntax and wildcards here. https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-query-string-query.html#query-string-syntax
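In the PHP client, the same request could be built roughly as follows. This is a sketch under assumptions: toPrefixQuery is a hypothetical helper that appends * to each keyword, and it does not escape the characters that query_string treats as reserved, so user input would need sanitising first.

```php
<?php
// Hypothetical helper: turns each whitespace-separated keyword into a prefix wildcard.
function toPrefixQuery(string $keywords): string
{
    $terms = preg_split('/\s+/', trim($keywords), -1, PREG_SPLIT_NO_EMPTY);
    $prefixed = array_map(function ($term) {
        return $term . '*';
    }, $terms);
    return implode(' ', $prefixed);
}

$keywords = 'Er';

$params = [
    'index' => 'letsmeetup',
    'type' => 'person',
    'body' => [
        'query' => [
            'query_string' => [
                'fields' => [
                    'first_name', 'last_name', 'bio', 'username',
                    'email_address', 'interests', 'skills',
                ],
                'query' => toPrefixQuery($keywords),
            ],
        ],
    ],
];

echo json_encode($params['body']), "\n";
```

The array is then passed to $client->search($params) as in the question.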
I hope this helps!
In my example code I am using the php client library, but it should be understood by anyone familiar with elasticsearch.
I'm using elasticsearch to create an index where each document contains an array of nGram indexed authors. Initially, the document will have a single author, but as time progresses, more authors will be appended to the array. Ideally, a search could be executed by an author's name, and if any of the authors in the array get matched, the document will be found.
I have been trying to use the documentation here for appending to the array and here for using the array type - but I have not had success getting this working.
First, I want to create an index for documents, with a title, array of authors, and an array of comments.
$client = new Client();
$params = [
'index' => 'document',
'body' => [
'settings' => [
// Simple settings for now, single shard
'number_of_shards' => 1,
'number_of_replicas' => 0,
'analysis' => [
'filter' => [
'shingle' => [
'type' => 'shingle'
]
],
'analyzer' => [
'my_ngram_analyzer' => [
'tokenizer' => 'my_ngram_tokenizer',
'filter' => 'lowercase',
]
],
// Allow searching for partial names with nGram
'tokenizer' => [
'my_ngram_tokenizer' => [
'type' => 'nGram',
'min_gram' => 1,
'max_gram' => 15,
'token_chars' => ['letter', 'digit']
]
]
]
],
'mappings' => [
'_default_' => [
'properties' => [
'document_id' => [
'type' => 'string',
'index' => 'not_analyzed',
],
// The name, email, or other info related to the person
'title' => [
'type' => 'string',
'analyzer' => 'my_ngram_analyzer',
'term_vector' => 'yes',
'copy_to' => 'combined'
],
'authors' => [
'type' => 'list',
'analyzer' => 'my_ngram_analyzer',
'term_vector' => 'yes',
'copy_to' => 'combined'
],
'comments' => [
'type' => 'list',
'analyzer' => 'my_ngram_analyzer',
'term_vector' => 'yes',
'copy_to' => 'combined'
],
]
],
]
]
];
// Create index `document` with ngram indexing
$client->indices()->create($params);
Off the get go, I can't even create the index due to this error:
{"error":"MapperParsingException[mapping [_default_]]; nested: MapperParsingException[No handler for type [list] declared on field [authors]]; ","status":400}
HAD this gone successfully, though, I would plan to index a document, starting with empty arrays for authors and comments, something like this:
$client = new Client();
$params = array();
$params['body'] = array('document_id' => 'id_here', 'title' => 'my_title', 'authors' => [], 'comments' => []);
$params['index'] = 'document';
$params['type'] = 'example_type';
$params['id'] = 'id_here';
$ret = $client->index($params);
return $ret;
This seems like it should work if I had the desired index to add this structure of information to, but what concerns me would be appending something to the array using update. For example,
$client = new Client();
$params = array();
//$params['body'] = array('person_id' => $person_id, 'emails' => [$email]);
$params['index'] = 'document';
$params['type'] = 'example_type';
$params['id'] = 'id_here';
$params['script'] = 'NO IDEA WHAT THIS SCRIPT SHOULD BE TO APPEND TO THE ARRAY';
$ret = $client->update($params);
return $ret;
}
I am not sure how I would go about actually appending a thing to the array and making sure it's indexed.
Finally, another thing that confuses me is how I could search based on any author in the array. Ideally I could do something like this:
But I'm not 100% sure whether it will work. Maybe there is something fundamental about Elasticsearch that I am not understanding. I am completely new to it, so any resources that will get me to a point where these little details don't hang me up would be appreciated.
Also, any direct advice on how to use elasticsearch to solve these problems would be appreciated.
Sorry for the big wall of text. To recap, I am looking for advice on how to:
Create an index that supports nGram analysis on all elements of an array
Update that index to append to the array
Search the now-updated index
Thanks for any help
EDIT: thanks to @astax, I am now able to create the index and append to the value as a string. HOWEVER, there are two problems with this:
the array is stored as a string value, so a script like
$params['script'] = 'ctx._source.authors += [\'hello\']';
actually appends a STRING with [] rather than an array containing a value.
the value inputted does not appear to be ngram analyzed, so a search like this:
$client = new Client();
$searchParams['index'] = 'document';
$searchParams['type'] = 'example_type';
$searchParams['body']['query']['match']['_all'] = 'hello';
$queryResponse = $client->search($searchParams);
print_r($queryResponse); // SUCCESS
will find the new value but a search like this:
$client = new Client();
$searchParams['index'] = 'document';
$searchParams['type'] = 'example_type';
$searchParams['body']['query']['match']['_all'] = 'hel';
$queryResponse = $client->search($searchParams);
print_r($queryResponse); // NO RESULTS
does not
There is no "list" type in Elasticsearch. But you can use the "string" field type and store an array of values.
....
'comments' => [
'type' => 'string',
'analyzer' => 'my_ngram_analyzer',
'term_vector' => 'yes',
'copy_to' => 'combined'
],
....
And index a document this way:
....
$params['body'] = array(
'document_id' => 'id_here',
'title' => 'my_title',
'authors' => [],
'comments' => ['comment1', 'comment2']);
....
As for the script for appending an element to the array, this answer may help you: Elasticsearch upserting and appending to array
However, do you really need to update the document? It might be easier to just reindex it as this is exactly what Elasticsearch does internally. It reads the "_source" property, does the required modification and reindexes it. BTW, this means that "_source" must be enabled and all properties of the document should be included into it.
You also may consider storing comments and authors (as I understand these are authors of comments, not the document authors) as child document in ES and using "has_child" filter.
I can't really give you specific solution, but strongly recommend installing Marvel plugin for ElasticSearch and use its "sense" tool to check how your overall process works step by step.
So check if your tokenizer is properly configured by running tests as described at http://www.elastic.co/guide/en/elasticsearch/reference/1.4/indices-analyze.html.
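To know what to expect from that check, here is a rough local simulation of the tokens an nGram tokenizer with min_gram 1 and max_gram 15, followed by a lowercase filter, should produce for one word. The ngrams function is a hypothetical helper written for this illustration (ASCII only); it is not Elasticsearch's implementation.

```php
<?php
// Hypothetical helper: lists every character n-gram of a single word,
// lowercased, for lengths between $min and $max.
function ngrams(string $word, int $min, int $max): array
{
    $word = strtolower($word);
    $len = strlen($word);
    $grams = [];
    for ($start = 0; $start < $len; $start++) {
        for ($size = $min; $size <= $max && $start + $size <= $len; $size++) {
            $grams[] = substr($word, $start, $size);
        }
    }
    return $grams;
}

$tokens = ngrams('hello', 1, 15);
echo implode(' ', $tokens), "\n";

// 'hel' is one of the grams, so a field indexed with this analyzer should match it.
var_dump(in_array('hel', $tokens, true));
```

If the analyze API returns tokens like these but the search for 'hel' still finds nothing, it may be worth checking whether the field being searched (here _all) actually uses my_ngram_analyzer.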
Then check if your update script is doing what you expect by retrieving the document by running GET /document/example_type/some_existing_id
The authors and comments should be arrays, not strings.
Finally perform the search:
GET /document/_search
{
"query" : {
"match": { "_all": "hel" }
}
}
If you're building the query yourself rather than getting it from the user, you may use query_string with wildcards:
GET /document/_search
{
"query" : {
"query_string": {
"fields": ["_all"],
"query": "hel*"
}
}
}