Search by term giving empty result using elastica - php

I'm using Elastica, the Elasticsearch PHP client. My documents have many fields, but I only want to search in the "screen_name" field. How do I do that? I tried a Term filter without success: the result is an empty array.
Here is the code.
// Load index (database)
$elasticaIndex = $elasticaClient->getIndex('twitter_json');
//Load type (table)
$elasticaType = $elasticaIndex->getType('tweet_json');
$elasticaFilterScreenName = new \Elastica\Filter\Term();
$elasticaFilterScreenName->setTerm('screen_name', 'sohail');
//Search on the index.
$elasticaResultSet = $elasticaIndex->search($elasticaFilterScreenName);
var_dump($elasticaResultSet); exit;
$elasticaResults = $elasticaResultSet->getResults();
$totalResults = $elasticaResultSet->getTotalHits();

It's hard to say without knowing your mapping, but there is a good chance the document property "screen_name" does not contain the term "sohail".
Try using a Match query or a Query_String query.
"Term" has a special meaning in Elasticsearch. A term is the base, atomic unit of an index. Terms are generated after your text is run through an analyzer. If "screen_name" has an analyzer associated with it in the index, the stored value is modified in some way before being saved into the index. A Term filter does not analyze its input, so it only matches if the exact term you pass exists in the index.

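The difference between an exact term lookup and an analyzed match can be sketched in plain PHP. This is a toy model only: the analyze() function below (lowercase, split on non-alphanumerics) is an assumption made for illustration and is not Elasticsearch's actual analyzer, and the function names and sample values are hypothetical.

```php
<?php
// Toy analyzer (assumption): lowercase the text and split on anything
// that is not a letter or digit. Real analyzers are configurable.
function analyze(string $text): array
{
    return preg_split('/[^a-z0-9]+/', strtolower($text), -1, PREG_SPLIT_NO_EMPTY);
}

// Toy inverted index: term => list of document ids containing it.
function buildIndex(array $docs): array
{
    $index = [];
    foreach ($docs as $id => $text) {
        foreach (analyze($text) as $term) {
            $index[$term][] = $id;
        }
    }
    return $index;
}

// Term-style lookup: no analysis, the input must equal an indexed term.
function termLookup(array $index, string $term): array
{
    return $index[$term] ?? [];
}

// Match-style lookup: the input is analyzed first, then each term is looked up.
function matchLookup(array $index, string $text): array
{
    $ids = [];
    foreach (analyze($text) as $term) {
        $ids = array_merge($ids, termLookup($index, $term));
    }
    return array_values(array_unique($ids));
}

$index = buildIndex([1 => 'Sohail_Khan', 2 => 'someone-else']);

$exactMiss = termLookup($index, 'Sohail_Khan');  // raw value is not an indexed term
$exactHit  = termLookup($index, 'sohail');       // an indexed term
$matched   = matchLookup($index, 'Sohail_Khan'); // analyzed before lookup
```

In this toy index the raw string is never stored as a term, so the exact lookup misses while the analyzed lookup finds document 1. Which situation applies to the real "screen_name" field depends on its mapping.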

How do I make a Case Insensitive, Partial Text Search Engine that uses Regex with MongoDB and PHP?

I'm trying to improve the search bar in my application. If a user types "Titan" into the search bar right now, the application will retrieve the movie "Titanic" from MongoDB every time I use the following regex function:
require 'dbconnection.php';
if ($_SERVER["REQUEST_METHOD"] == "POST") {
$input= $_REQUEST['input'];
$query=$collection->find(['movie' => new MongoDB\BSON\Regex($input)]);
}
I can also make collections case insensitive by creating the following index within the Mongo shell, so if a user types "tiTAnIc" into the search bar, the application will retrieve the movie "Titanic" from MongoDB:
db.createCollection("c1", { collation: { locale: 'en_US', strength: 2 } } )
db.c1.createIndex( { movie: 1 } )
I am not capable of combining these two features at the same time, however. The index above will only remove case sensitivity when I change my query to this:
$query=$collection->find( [ 'movie' => $input] );
If I use the regex query at the top in tandem with the collated index, it will ignore the regex part, so if I type "Titan," it doesn't retrieve anything; if I type "Titanic," however, it will successfully retrieve "Titanic" (because "Titanic" is the exact word stored in my database).
Any advice?
Beware: a regex search on an indexed field will affect performance, as stated in the $regex docs:
Case insensitive regular expression queries generally cannot use indexes effectively. The $regex implementation is not collation-aware and is unable to utilize case-insensitive indexes.
Your problem is that MongoDB can only use an index efficiently for a $regex when the pattern is a prefix expression (for example /^acme/).
For case sensitive regular expression queries, if an index exists for the field, then MongoDB matches the regular expression against the values in the index, which can be faster than a collection scan. Further optimization can occur if the regular expression is a “prefix expression”, which means that all potential matches start with the same string. This allows MongoDB to construct a “range” from that prefix and only match against those values from the index that fall within that range.
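So the query needs to be anchored at the start of the string, like this (preg_quote() escapes any regex metacharacters in the user's input):
$query=$collection->find(['movie' => new MongoDB\BSON\Regex('^'.preg_quote($input), 'i')]);
Note that, according to the documentation quoted above, the case-insensitive 'i' flag still prevents the index from being used effectively.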
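The behavior of an anchored, case-insensitive prefix pattern can be checked locally without a database. The sketch below uses PHP's preg_match() as a stand-in for MongoDB's PCRE-based matching; the helper name prefixRegexParts() and the sample titles are hypothetical.

```php
<?php
// Hypothetical helper: builds the pattern and flags for a case-insensitive
// prefix search, escaping regex metacharacters in the user's input.
function prefixRegexParts(string $input): array
{
    return ['^' . preg_quote($input, '/'), 'i'];
}

[$pattern, $flags] = prefixRegexParts('tiTAn');

// Local stand-in for the database: apply the same pattern with preg_match().
$phpRegex = '/' . $pattern . '/' . $flags;
$titles = ['Titanic', 'Clash of the Titans', 'Avatar'];
$hits = array_values(array_filter(
    $titles,
    fn(string $title): bool => preg_match($phpRegex, $title) === 1
));

// Metacharacters typed by the user are matched literally.
[$escaped] = prefixRegexParts('a.b');
```

With the real driver, the two parts would be passed as new MongoDB\BSON\Regex($pattern, $flags). One trade-off of anchoring: a title such as "Clash of the Titans" no longer matches "Titan", because the match must start at the beginning of the string.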
I suggest you design your collection more carefully.
Related:
https://stackoverflow.com/a/46228114/6118551
https://scalegrid.io/blog/mongodb-regular-expressions-indexes-performance/

Neo4J, how to query hierarchical data / PHP

I created a category structure in the graph database Neo4j.
I have nodes and relationships, and everything works.
I am using Neoxygen Neoclient for PHP to access the data. How can I query the whole category-graph in an efficient way (including structure) from my root-element?
MATCH (a:`Category`{category_id:0})-[r:HAS_CHILD*]->(b:`Category`)
RETURN b,r
My desired structure in PHP is:
- Root
--- Category A
-------- Subcategory AB
--- Category B
--- Category C
-------- Subcategory CA
----------------- Subsubcategory CAA
...
Any ideas?
Thanks in advance.
mr_g
It is totally feasible and user-friendly in Neoxygen's NeoClient.
First, make sure that you activate the response formatter:
$client = ClientBuilder::create()
->setAutoFormatResponse(true)
->addConnection(xxx...)
->build();
Secondly, concerning your query, I would definitely set a depth limit to avoid memory problems, which can occur depending on how connected your graph is:
MATCH (a:`Category`{category_id:0})-[r:HAS_CHILD*..20]->(b:`Category`)
RETURN b,r
Then you can send it with the client, which will remap the results into a graph structure:
$query = 'MATCH (a:`Category`{category_id:{id}})-[r:HAS_CHILD*..20]->(b:`Category`)
RETURN b,r';
$children = $client->sendCypherQuery($query, ['id'=>0])->getResult()->getNodes();
Now each node knows its relationships, and each relationship knows its start and end nodes. For example, treating $children as the nodes at the first depth:
$nodes = [];
foreach ($children as $child) {
    // collect the end node of every outgoing relationship of this child
    foreach ($child->getOutboundRelationships() as $rel) {
        $nodes[] = $rel->getEndNode();
    }
}
$nodes now holds all the nodes at depth 2.
There is, currently, no method to get directly the connected nodes from a node object without getting first the relationship, maybe something I can add to the client.
Cypher returns tabular data so if you want to get a tree hierarchy the most efficient way is to return all of the paths from the root to the leaves. A path is a collection/array of node-(relationship-node)* (that is it's an odd number of objects, always containing a node at each end with alternating nodes and relationships). Here is the cypher for how you would do that:
MATCH path = (a:`Category`{category_id:0})-[:HAS_CHILD*]->(b:`Category`)
WHERE NOT (b)-[:HAS_CHILD]->()
RETURN path
The WHERE clause ensures that you only return the paths that end at a leaf. You could return all categories in the tree, which would give you the partial paths too, but all of those partial paths are contained in some longer path, so you'd just end up returning more data than you need.
Once you have the paths (I'm not sure what form they show up in Neoclient as I'm not a PHP guy) you can build a hierarchical data structure in memory from the results. If I recall correctly the map/dictionary-type structure in PHP is an associative array.
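Building the in-memory hierarchy from root-to-leaf paths could look roughly like the sketch below. It assumes each path has already been reduced to an ordered list of category names; how NeoClient actually exposes paths is not covered here, and the function names buildTree() and renderTree() are hypothetical.

```php
<?php
// Assumption: each path is an ordered list of category names, root first.
// The tree is a nested associative array: name => children.
function buildTree(array $paths): array
{
    $tree = [];
    foreach ($paths as $path) {
        $cursor = &$tree;               // start at the top for every path
        foreach ($path as $name) {
            if (!isset($cursor[$name])) {
                $cursor[$name] = [];    // first time this category is seen here
            }
            $cursor = &$cursor[$name];  // descend one level
        }
        unset($cursor);                 // drop the reference before the next path
    }
    return $tree;
}

// Renders the nested array as an indented list.
function renderTree(array $tree, int $depth = 0): string
{
    $out = '';
    foreach ($tree as $name => $children) {
        $out .= str_repeat('  ', $depth) . '- ' . $name . "\n";
        $out .= renderTree($children, $depth + 1);
    }
    return $out;
}

$paths = [
    ['Root', 'Category A', 'Subcategory AB'],
    ['Root', 'Category B'],
    ['Root', 'Category C', 'Subcategory CA', 'Subsubcategory CAA'],
];
$tree = buildTree($paths);
echo renderTree($tree);
```

Using names as array keys assumes sibling names are unique; with real data, keying by category_id would be the safer choice.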
Schema:
Indexes
ON :Category(category_id) ONLINE (for uniqueness constraint)
Constraints
ON (category:Category) ASSERT category.category_id IS UNIQUE
Query:
MATCH(c:Category) RETURN c

Assign array value to a field

I am working on a migration, and I am migrating the taxonomy terms that each document has been tagged with. The terms in the document are separated by commas. So far I have managed to separate the terms and place them into an array, like so:
public function prepareRow($row) {
$terms = explode(",", $row->np_tax_terms);
foreach ($terms as $key => $value) {
$terms[$key] = trim($value);
}
var_dump($terms);
exit;
}
This gives me the following result when I dump it in the terminal:
array(2) {
[0]=>
string(7) "Smoking"
[1]=>
string(23) "Not Smoking"
}
Now I have two fields, field_one and field_two, and I want to place element 0 of the array into field_one and element 1 into field_two,
e.g.
field_one=[0]$terms;
I know this isn't correct and I'm not sure how to do this part. Any suggestions on how to do this please?
If you are only looking to store the string value of the taxonomy term into a different field of a node, then the following code should do the trick:
$node->field_one['und'][0]['value'] = $terms[0];
$node->field_two['und'][0]['value'] = $terms[1];
node_save($node);
Note that you will need to load the node first. If you need help with that, comment here and I will update my answer.
You are asking specifically about ArrayList and HashMap, but I think to fully understand what is going on you have to understand the Collections framework. So an ArrayList implements the List interface and a HashMap implements the Map interface.
List:
An ordered collection (also known as a sequence). The user of this interface has precise control over where in the list each element is inserted. The user can access elements by their integer index (position in the list), and search for elements in the list.
Map:
An object that maps keys to values. A map cannot contain duplicate keys; each key can map to at most one value.
So, as other answers have discussed, a List (for example ArrayList) is an ordered collection of objects that you access by index, much like an array (in the case of ArrayList, as the name suggests, it is backed by an array, but most of the details of managing that array are handled for you). You would use an ArrayList when you want to keep things in a defined order (the order in which they are added, or the position within the list that you specify when you add the object).
A Map, on the other hand, takes one object and uses it as a key (index) to another object (the value). So let's say you have objects with unique IDs, and you know you will want to access these objects by ID at some point; a Map makes this very easy (and quicker/more efficient). The HashMap implementation uses the hash value of the key object to locate where it is stored, so there is no guarantee about the order of the values.
You might like to try:
list($field_one, $field_two) = prepareRow($row);
list() is a language construct that assigns the entries of an array, in order, to the listed variables. For this to work, prepareRow() must return $terms instead of dumping it and exiting.
This is a little fragile, but should work so long as you know you'll have at least two items in your prepareRow result.
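One way to make the assignment less fragile is to pad the array to two entries before destructuring it. This is a minimal sketch, assuming the terms arrive as a single comma-separated string; the helper name splitTerms() is hypothetical and is not part of the migration API.

```php
<?php
// Hypothetical helper: split a comma-separated string into trimmed,
// non-empty terms.
function splitTerms(string $raw): array
{
    $terms = array_map('trim', explode(',', $raw));
    return array_values(array_filter($terms, fn(string $t): bool => $t !== ''));
}

// array_pad() guarantees two entries, so the destructuring never reads
// a missing offset; absent terms simply become null.
[$fieldOne, $fieldTwo] = array_pad(splitTerms('Smoking, Not Smoking'), 2, null);

// With a single term, the second variable is null.
[$onlyOne, $missing] = array_pad(splitTerms('Smoking'), 2, null);
```

The short [$a, $b] syntax is equivalent to list($a, $b) in PHP 7.1 and later.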

How to filter mongodb result for datatables?

I use this script to get the collection of my mongo database: http://datatables.net/development/server-side/php_mongodb
My question is: how to retrieve the rows where foo == 'mystring' only?
As you can see on line 29 of that script's source code, the Mongo collection is held in the variable $m_collection, so:
$m_collection->find(array('foo' => 'mystring'))
should work.
If this is not what you are looking for maybe you can be more specific and explain exactly what you are trying to do.
UPDATE
It has come to my attention that you might instead want to edit the $searchTermsAll variable to search by this field in a document. By the looks of it, this PHP class hooks in the same way as it normally would for SQL, so you shouldn't need to do anything special: you can just enable filtering on DataTables and add the value mystring to the foo field.
However to know if it is the right answer you will need to clarify.
UPDATE 2
A more invasive way of doing this, which should still keep filtering working, is to replace line 99 with:
$cursor = $m_collection->find(array_merge($searchTerms,
array('foo' => 'mystring')), $fields);
That will always make sure that your condition is added to the search terms, while keeping the user's own search terms.
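The effect of array_merge() on the criteria can be seen in isolation. The sketch below uses hypothetical sample search terms and a hypothetical helper name, withFixedCondition(); only array_merge() itself comes from the answer above.

```php
<?php
// Hypothetical helper: combine the user's search terms with a condition
// that must always apply. For string keys, array_merge() lets the later
// array win, so the fixed condition overrides a user term on the same field.
function withFixedCondition(array $searchTerms, array $fixed): array
{
    return array_merge($searchTerms, $fixed);
}

// Sample user terms (assumed for illustration).
$userTerms = ['browser' => 'Firefox', 'foo' => 'something else'];
$criteria = withFixedCondition($userTerms, ['foo' => 'mystring']);

// With no user terms, only the fixed condition remains.
$onlyFixed = withFixedCondition([], ['foo' => 'mystring']);
```

Because the fixed condition is passed as the second argument, a user cannot widen the result set by supplying their own value for foo.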
Use it as below:
$cursor = $collection->find(array("foo" => "mystring"));
Here are more details: http://www.php.net/manual/en/mongo.queries.php

Searching numbers with Zend_Search_Lucene

So why does the first search example below return no results? And any ideas on how to modify the below code to make number searches possible would be much appreciated.
Create the index
$index = new Zend_Search_Lucene('/myindex', true);
$doc = new Zend_Search_Lucene_Document();
$doc->addField(Zend_Search_Lucene_Field::Text('ssn', '123-12-1234'));
$doc->addField(Zend_Search_Lucene_Field::Text('cats', 'Fluffy'));
$index->addDocument($doc);
$index->commit();
Search - NO RESULTS
$index = new Zend_Search_Lucene('/myindex', true);
$results = $index->find('123-12-1234');
Search - WITH RESULTS
$index = new Zend_Search_Lucene('/myindex', true);
$results = $index->find('Fluffy');
First, you need to change your text analyzer to one that includes numbers:
Zend_Search_Lucene_Analysis_Analyzer::setDefault( new Zend_Search_Lucene_Analysis_Analyzer_Common_TextNum() );
Then, for fields containing numbers, use Zend_Search_Lucene_Field::Keyword instead of Zend_Search_Lucene_Field::Text.
This skips the creation of tokens and saves the value 'as is' into the index, so you can then search by it. I don't know how it behaves with floats (it probably won't work, since 3.0 is not going to match 3), but for natural numbers (like IDs) it works like a charm.
This is an effect of which Analyzer you have chosen.
I believe the default Analyzer will only index terms that match /[a-zA-Z]+/. This means that your SSN isn't being added to the index as a term.
Even if you switched to the text+numeric case-insensitive Analyzer, what you want still would not work. The expression for a term is /[a-zA-Z0-9]+/, which means the terms added to the index would be 123, 12, and 1234.
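The effect of the two token expressions mentioned above can be reproduced with preg_match_all(). This only illustrates the regular expressions as quoted in this answer; it does not call Zend_Search_Lucene, and the helper name tokenize() is hypothetical.

```php
<?php
// Hypothetical helper: return every substring of $text that matches
// the given token expression.
function tokenize(string $text, string $tokenPattern): array
{
    preg_match_all($tokenPattern, $text, $matches);
    return $matches[0];
}

$ssn = '123-12-1234';

// Letters only: the SSN produces no terms at all.
$lettersOnly = tokenize($ssn, '/[a-zA-Z]+/');

// Letters and digits: the hyphens split the SSN into three separate terms.
$textNum = tokenize($ssn, '/[a-zA-Z0-9]+/');

// A plain word survives both expressions as a single term.
$word = tokenize('Fluffy', '/[a-zA-Z]+/');
```

Neither expression yields 123-12-1234 as a single term, which is why the value has to be stored without tokenization or handled by a custom analyzer.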
If you need 123-12-1234 to be seen as a valid term, you are probably going to need to extend Zend_Search_Lucene_Analysis_Analyzer_Common and make it so that 123-12-1234 is a term.
See
http://framework.zend.com/manual/en/zend.search.lucene.extending.html#zend.search.lucene.extending.analysis
Your other choice is to store the SSN as a Zend_Search_Lucene_Field::Keyword, since a keyword is not broken up into terms.
http://framework.zend.com/manual/en/zend.search.lucene.html#zend.search.lucene.index-creation.understanding-field-types
