Zend Search Lucene query API: boosting a term - PHP

I am using Zend Search Lucene and would like to add boosts to some of my search terms.
The code is already written using the query construction API, as follows:
$query->addTerm(new Zend_Search_Lucene_Index_Term($name,'name'), null);
I tried writing
$query->addTerm(new Zend_Search_Lucene_Index_Term($name . "^10", 'name'), null);
But that does not appear to work (I suddenly get no results at all).
This caret syntax is listed in the documentation for the query language, but not in the docs for the query construction API. I know that in some instances the API doesn't behave quite like the plain query language. Is this one of those instances?
Is there a function or parameter that adds boost values to terms?

Try outputting your query by doing something like this:
$term = new Zend_Search_Lucene_Index_Term($name,'name');
$query = new Zend_Search_Lucene_Search_Query_Term($term);
echo $query;
This will allow you to see the query that is being created before you use it to execute a search.
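One likely reason for the empty result set is that the query construction API does not parse query-language syntax, so "^10" simply becomes part of the term text. Below is a minimal sketch of an alternative. It assumes Zend Framework 1 is loadable and that setBoost() on Zend_Search_Lucene_Search_Query is available in your version; the helper name buildBoostedNameQuery is hypothetical.

```php
<?php
// Hypothetical helper: builds a boolean query in which the 'name' term is boosted.
// Assumes Zend Framework 1 (Zend_Search_Lucene) is loadable when the function is called.
function buildBoostedNameQuery($name, $boost = 10)
{
    // Term text comes first, field name second.
    $term = new Zend_Search_Lucene_Index_Term($name, 'name');

    // Wrap the term in its own query so the boost applies to this term only.
    $termQuery = new Zend_Search_Lucene_Search_Query_Term($term);
    $termQuery->setBoost($boost);

    $query = new Zend_Search_Lucene_Search_Query_Boolean();
    // null marks the subquery as optional, matching the sign used in the question.
    $query->addSubquery($termQuery, null);

    return $query;
}
```

Echoing the returned query, as suggested above, should show whether the boost was picked up before the search is run.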

Related

Mongo Doctrine Query Builder Select does not work. Bug?

$dm = $this->get('doctrine.odm.mongodb.document_manager');
$query = $dm->createQueryBuilder('MyBundle:Listing')
    ->select('title')
    ->field('coordinates')->geoNear(
        (float)$longitude,
        (float)$latitude
    )->spherical(true);
$classifieds_array = $classifieds->toArray();
$data = array('success' => true, 'classifieds' => $classifieds_array,
    'displaymessage' => $classifieds->count() . " Search Results Found");
Even though I am selecting just one field for my result set, I am getting everything in the collection back along with title. Is this a bug?
NOTE: I commented out the ->field('coordinates')->geoNear((float)$longitude, (float)$latitude)->spherical(true) line and now the select seems to work. This is crazy.
The geoNear command in MongoDB doesn't seem to support filtering result fields, according to the documentation examples. Only a query option is supported to limit matched documents.
In your case, it also looks like you are mixing up the geoNear() and near() builder methods in Doctrine. Since you're operating on the coordinates field, the appropriate method would be near(). geoNear() is a top-level method that tells the builder you wish to use the geoNear command, which doesn't require a field name since it uses the one and only geospatial index on the collection.
For usage examples, I would advise looking at the query and builder unit tests in the Doctrine MongoDB library.

Zend Gdata Calendar: setQuery with exact phrase match?

I am having trouble using Zend_Gdata_Calendar to return a subset of Google Calendar events with an exact phrase match.
The reference guide for Zend_Gdata_Books, in the setQuery() section, suggests this should be possible with Zend:
Note that any spaces, quotes or other punctuation in the parameter value must be URL-escaped (Use a plus (+) for a space). To search for an exact phrase, enclose the phrase in quotation marks. For example, to search for books matching the phrase "spy plane", set the q parameter to %22spy+plane%22.
As far as I can tell, Zend_Gdata_Books and Zend_Gdata_Calendar extend the same setQuery() function, so I figure they're equivalent on the Zend end of things.
As for Google, the Calendar query parameters reference says the Calendar API supports the standard Data API query parameters, which in turn says the full-text query string q supports case-insensitive exact phrase searching, just as Zend_Gdata_Books indicates.
I've tried it all these ways:
$gCal = new Zend_Gdata_Calendar();
$query = $gCal->newEventQuery();
$query->setQuery("%22event+text%22"); //no results
$query->setQuery("%22event%20text%22"); //no results
$query->setQuery("\"event text\""); //too many results
$query->setQuery('"event text"'); //too many results
$query->setQuery("event text"); //too many results
I realize the first two didn't work because the string is getting doubly URL-encoded. In the latter cases, I am getting the event I want, but also events including "event" or "text" that I don't want.
Could it be that Google has implemented the full-text query differently for the Calendar API? What kind of dumb things might I be doing on my end to break it?
It looks like you want to try it with $query->setQuery('"event text"');
which yields the query string Query string(19) "?q=%22event+text%22"
whereas $query->setQuery("event text");
yields the query string Query string(13) "?q=event+text"
To test, I used:
$gCal = new Zend_Gdata_Calendar();
$query = $gCal->newEventQuery();
$query->setQuery("event text");
Zend_Debug::dump($query->getQueryString(), 'Query');
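The double encoding described in the question can be reproduced with plain PHP. This is only an illustration: urlencode() stands in for whatever encoding the library applies when it builds the query string, which is an assumption about its internals.

```php
<?php
// Plain-PHP illustration; urlencode() is assumed to approximate the encoding
// applied when the query string is built.
$phrase = '"event text"';

// Encoding the raw phrase once gives the form the API expects.
$encodedOnce = urlencode($phrase);        // %22event+text%22

// Passing an already-encoded string in means it gets encoded a second time.
$encodedTwice = urlencode($encodedOnce);  // %2522event%2Btext%2522

echo "?q=" . $encodedOnce . "\n";
echo "?q=" . $encodedTwice . "\n";
```

The second form no longer decodes to a quoted phrase on the server side, which would explain why the pre-encoded variants return no results.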

Sphinx SQL search: excluding items

Have my sphinx search going like so:
$result = $cl->query($_REQUEST['term'], 'myindex');
But I'd like to be able to filter out certain results that don't match a string value, something like:
$result = $cl->query($_REQUEST['term'] . " and somestringcol <> ''", 'myindex');
Is there some proper way to do this using the sphinx PHP API?
You can use SetFilter() to specify a filter on an attribute you defined.
See: http://www.sphinxsearch.com/docs/manual-1.10.html#attributes
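A minimal sketch of the SetFilter() approach. It assumes a hypothetical integer attribute has_somestringcol that the index source sets to 1 when somestringcol is non-empty, since SetFilter() matches integer attribute values rather than strings; the helper name is made up as well.

```php
<?php
// Hypothetical helper. $cl is expected to be a configured SphinxClient.
// 'has_somestringcol' is an assumed integer attribute (1 = somestringcol is
// non-empty) that would have to be defined in the index source.
function searchWithNonEmptyColumn($cl, $term, $index = 'myindex')
{
    // Keep only documents whose attribute value is in the given list.
    $cl->SetFilter('has_somestringcol', array(1));

    return $cl->Query($term, $index);
}
```

It would be called as searchWithNonEmptyColumn($cl, $_REQUEST['term']) in place of the plain query() call from the question.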
As Langdon mentions, you can use SetFilter(), but you may also be able to use the field search operator, which is available in the extended search syntax, to search specific fields of your index rather than the attributes associated with it.
$result = $cl->query($_REQUEST['term'] . " @somestringcol -term", 'myindex');
The documentation for Sphinx provides many good examples: http://sphinxsearch.com/docs/1.10/extended-syntax.html

Match whole field in Lucene

I'm currently indexing a database with lucene. I was thinking of storing the table id in the index, but I can't find a way to retrieve the documents by that field. I guess some pseudo-code will further clarify the question:
document.add("_id", 7, Field.Index.UN_TOKENIZED, Field.Store.YES);
// How can I query the document with _id=7
// without getting the document with _id=17 or _id=71?
EDIT for Zend Lucene:
You will need a Keyword type field in order for it to be searched.
For indexing, use something like:
$doc->addField(Zend_Search_Lucene_Field::Keyword('_id', '7'));
For search, use:
$idTerm = new Zend_Search_Lucene_Index_Term('_id', '7');
$idQuery = new Zend_Search_Lucene_Search_Query_Term($idTerm);
Just to say that I've implemented this successfully on my Zend Lucene search engine. However, after some time troubleshooting, I discovered that the field name and field value in the search term are the opposite way around to the way shown. To correct the example:
// Fine - no change here
$doc->addField(Zend_Search_Lucene_Field::Keyword('_id', '7'));
// Reversed order of parameters
$idTerm = new Zend_Search_Lucene_Index_Term('7', '_id');
$idQuery = new Zend_Search_Lucene_Search_Query_Term($idTerm);
I hope that helps someone!

In Zend Lucene, how can I change the field which a query searches?

I am trying to create an "advanced search", where I can let the user search only specific fields of my index. For that, I'm using a boolean query:
$sq1 = Zend_Search_Lucene_Search_QueryParser::parse($field1); // <- provided by user
$sq2 = Zend_Search_Lucene_Search_QueryParser::parse($field2); // <- provided by user
$query = new Zend_Search_Lucene_Search_Query_Boolean();
$query->addSubquery($sq1, true);
$query->addSubquery($sq2, true);
$index->find($query);
How can I specify that sq1 will search field 'foo', and sq2 will search field 'bar'?
I feel like I should be parsing the queries differently for the effect (because the user might type in a field name), but the docs only mention the QueryParser for joining user-input queries with API queries.
It seems the simplest way to do this is just to fudge the user input:
$sq1 = Zend_Search_Lucene_Search_QueryParser::parse("foo:($field1)");
$sq2 = Zend_Search_Lucene_Search_QueryParser::parse("bar:($field2)");
$field1 and $field2 should be stripped of parentheses and colons beforehand to avoid "search injection".
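A minimal sketch of that stripping step. The helper name stripFieldSyntax is hypothetical, and it only handles the characters named above; other query-language operators would need separate treatment.

```php
<?php
// Hypothetical helper: replaces parentheses and colons with spaces so user
// input cannot close the "foo:( ... )" group or switch to another field.
function stripFieldSyntax($input)
{
    $cleaned = str_replace(array('(', ')', ':'), ' ', $input);

    // Collapse the runs of whitespace left behind by the replacement.
    return trim(preg_replace('/\s+/', ' ', $cleaned));
}

$field1 = stripFieldSyntax('bar:(secret) title');
echo "foo:($field1)\n"; // prints foo:(bar secret title)
```

Replacing with a space rather than an empty string keeps words apart that were only separated by punctuation.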
What you want is the query construction API: http://www.zendframework.com/manual/en/zend.search.lucene.query-api.html#zend.search.lucene.queries.multiterm-query
However, I'd recommend that you drop Zend_Search_Lucene altogether. The Java implementation is wonderful, but the PHP implementation is very bad. For what you are trying to do it is very buggy, see question 1508748. It's also very, very slow.
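For comparison, a rough sketch of the query construction API route mentioned above. It assumes Zend Framework 1 is loadable and that each input is a single term in the form it was indexed, because terms built this way are not passed through the analyzer; the helper name buildTwoFieldQuery is hypothetical.

```php
<?php
// Hypothetical helper: requires one term in field 'foo' and one in field 'bar'.
// Assumes Zend Framework 1 (Zend_Search_Lucene) is loadable when called, and
// that both arguments are single, already-normalised terms.
function buildTwoFieldQuery($fooTerm, $barTerm)
{
    $query = new Zend_Search_Lucene_Search_Query_MultiTerm();

    // Term text comes first, field name second; true marks the term as required.
    $query->addTerm(new Zend_Search_Lucene_Index_Term($fooTerm, 'foo'), true);
    $query->addTerm(new Zend_Search_Lucene_Index_Term($barTerm, 'bar'), true);

    return $query;
}
```

Because nothing here is parsed, a field name typed by the user stays plain term text, at the cost of losing phrase and wildcard support.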
