I'm using PDO to execute a MATCH AGAINST query.
The following returns nothing:
SELECT title, author, isbn, MATCH(title, isbn) AGAINST (:term) AS score
FROM books
WHERE MATCH(title, isbn) AGAINST (:term)
ORDER BY score DESC LIMIT 0,10
Whereas this returns results as expected:
SELECT title, author, isbn, MATCH(title, isbn) AGAINST (:term) AS score
FROM books
WHERE MATCH(title, isbn) AGAINST (:term IN BOOLEAN MODE)
ORDER BY score DESC LIMIT 0,10
Could anyone tell me why IN BOOLEAN MODE is making such a difference, and whether or not I should be using it in my query?
The first query runs as a "natural language search", which is the default when no search modifier is specified. This type of search additionally filters out words that are present in 50% or more of the rows.
"IN BOOLEAN MODE" does not do this additional filtering and thus may return matches when you search on a common term.
Whether you should use a boolean search depends on the specifics of your situation and cannot be determined without more information. Some considerations: the size of the input data set versus how large a matching data set you want returned, and whether you want results for words that occur frequently.
(Ref: http://dev.mysql.com/doc/refman/5.1/en/fulltext-search.html)
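To make the 50% rule concrete, here is a toy model of it, sketched in Python. It only illustrates the threshold described above and is not how MySQL implements full-text search; the function name matching_rows and the sample titles are made up for the example.

```python
def matching_rows(rows, term, boolean_mode=False):
    """Toy model: return the rows containing `term` as a whole word.

    In the simulated natural language mode, a term found in 50% or
    more of the rows is treated as too common and matches nothing.
    """
    term = term.lower()
    hits = [row for row in rows if term in row.lower().split()]
    if not boolean_mode and rows and len(hits) * 2 >= len(rows):
        return []  # term is "common": dropped by the 50% threshold
    return hits


# Hypothetical sample data: "python" appears in 2 of 3 titles.
books = ["Learning Python", "Python Cookbook", "MySQL Basics"]

print(matching_rows(books, "python"))                     # []
print(matching_rows(books, "python", boolean_mode=True))  # ['Learning Python', 'Python Cookbook']
print(matching_rows(books, "mysql"))                      # ['MySQL Basics']
```

On a one-row table every indexed word is in 100% of the rows, which is why small test tables tend to return nothing in natural language mode.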
I've built an internal DB / Search Engine for art creatives. I'm trying to create a search criterion where you can query one column in the database and also search several columns in the database for a phrase search using FullText Search. The example of a search query might be: November {and} Black Friday. November would search for creatives matching the created_for column and the black friday would search headline, subheadline and additional_text columns with a fulltext search. Any ideas of how to accomplish this would be really helpful!
SELECT
(SELECT * FROM headlines WHERE created_for = '$searchString' AND image_slug <> '')
(SELECT *, MATCH(headline) AGAINST('$fullText' IN BOOLEAN MODE) AS MultiScore, MATCH(subheadline, additional_text) AGAINST('$fullText' IN BOOLEAN MODE) AS MultiSecondScore
FROM `headlines`
WHERE MATCH(headline, subheadline, additional_text) AGAINST('$fullText' IN BOOLEAN MODE))
I've tried adding a UNION statement before the second Select statement, but I get an error message saying the columns don't match. Not sure what I've got wrong here, but thanks in advance for your help!
Use AND in the WHERE clause.
SELECT *, MATCH(headline) AGAINST('$fullText' IN BOOLEAN MODE) AS MultiScore, MATCH(subheadline, additional_text) AGAINST('$fullText' IN BOOLEAN MODE) AS MultiSecondScore
FROM headlines
WHERE created_for = '$searchString'
AND image_slug <> ''
AND MATCH(headline, subheadline, additional_text) AGAINST('$fullText' IN BOOLEAN MODE)
UNION would get results that match either of the criteria, not both of them. And when you use UNION, both subqueries have to return the same number of columns -- you would have to add extra columns to the first query to match the MultiScore and MultiSecondScore columns of the second query.
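The difference between UNION and AND, and the column-count rule, can be tried out with a small Python sketch. It uses an in-memory SQLite database as a stand-in for MySQL and LIKE in place of MATCH ... AGAINST, so the table layout and sample rows are assumptions made for the example; only the UNION/AND behaviour is the point.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE headlines (id INTEGER, created_for TEXT, headline TEXT)")
conn.executemany(
    "INSERT INTO headlines VALUES (?, ?, ?)",
    [
        (1, "November", "Black Friday deals"),
        (2, "November", "Autumn sale"),
        (3, "December", "Black Friday leftovers"),
    ],
)


def ids(sql):
    """Run a query and return the first column of every row."""
    return [row[0] for row in conn.execute(sql)]


# UNION: rows matching either criterion.
either = ids(
    "SELECT id FROM headlines WHERE created_for = 'November' "
    "UNION "
    "SELECT id FROM headlines WHERE headline LIKE '%Black Friday%' "
    "ORDER BY id"
)

# AND: rows matching both criteria.
both = ids(
    "SELECT id FROM headlines "
    "WHERE created_for = 'November' AND headline LIKE '%Black Friday%' "
    "ORDER BY id"
)

# UNION with one column on the left and two on the right is rejected.
try:
    conn.execute("SELECT id FROM headlines UNION SELECT id, headline FROM headlines")
    union_error = None
except sqlite3.OperationalError as exc:
    union_error = str(exc)

print(either)  # [1, 2, 3]
print(both)    # [1]
print(union_error is not None)  # True
```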
I am working on the search feature on my website and would like to improve it a little bit.
My website is a tube website.
I used to work with mysql 'LIKE' statement but realized MATCH() AGAINST() was way more performant.
But I still have a problem with short and partial words.
Let's say there is a title in the video database called: 'Funny video about monkeys - Asia - lol'
Here are the different SELECT queries, with what each one returns (run in phpMyAdmin to rule out errors caused by wrong PHP code):
SELECT video.*, MATCH (title) AGAINST ('lol' IN BOOLEAN MODE) AS relevance FROM video WHERE active = '1' AND MATCH (title) AGAINST ('lol' IN BOOLEAN MODE) ORDER BY relevance DESC LIMIT 18
returns
nothing
SELECT video.*, MATCH (title) AGAINST ('monkey' IN BOOLEAN MODE) AS relevance FROM video WHERE active = '1' AND MATCH (title) AGAINST ('monkey' IN BOOLEAN MODE) ORDER BY relevance DESC LIMIT 18
returns
nothing
SELECT video.*, MATCH (title) AGAINST ('asia' IN BOOLEAN MODE) AS relevance FROM video WHERE active = '1' AND MATCH (title) AGAINST ('asia' IN BOOLEAN MODE) ORDER BY relevance DESC LIMIT 18
returns
the video in the database
SELECT video.*, MATCH (title) AGAINST ('asian monkey' IN BOOLEAN MODE) AS relevance FROM video WHERE active = '1' AND MATCH (title) AGAINST ('asian monkey' IN BOOLEAN MODE) ORDER BY relevance DESC LIMIT 18
returns
nothing
So if I am right, the video is returned only if it contains the exact word, and only if the word is at least 4 characters long.
But the strange thing is that ft_min_word_len is set to 2, so even the query 'lol' should return the video. Also, the query states IN BOOLEAN MODE, so I guess it should return even common words.
As for partial words like 'monkey' instead of 'monkeys', I have no idea how to get around this; using LIKE '%...%' OR LIKE '%...%' won't be a solution because I need the relevance.
Well thank you for reading and helping me.
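One possible direction for the partial-word case is the trailing * (truncation) operator of boolean mode, so that 'monkey*' can match 'monkeys'. The following Python sketch builds such a search string from user input; the helper name to_prefix_query is hypothetical, and the resulting string is meant to be bound as the AGAINST parameter of a query. Note that '+asian*' would still not match 'Asia', because the search prefix is longer than the indexed word.

```python
import re

# Characters that act as operators in a boolean-mode search string.
_OPERATORS = re.compile(r'[+\-<>()~*"@]')


def to_prefix_query(user_input):
    """Turn free text into a boolean-mode string of required prefixes."""
    # Replace operator characters so user input cannot alter the query logic.
    cleaned = _OPERATORS.sub(" ", user_input)
    # Require every word (+) and allow any suffix after it (*).
    return " ".join(f"+{word}*" for word in cleaned.split())


print(to_prefix_query("monkey"))        # +monkey*
print(to_prefix_query("asian monkey"))  # +asian* +monkey*
print(to_prefix_query("lol*"))          # +lol*
```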
I would like to tweak the results returned by this Fulltext-query:
$STH = $DBH->prepare('SELECT *,
MATCH (title,title_under,subject) AGAINST (:query) AS score
FROM articles
WHERE MATCH(title,title_under,subject) AGAINST(:query IN BOOLEAN MODE)
order by score desc');
Is there a way to return the score calculated by mysql so that I can run my own conditions for adding/subtracting points before parsing the results?
Yes, the "MATCH() AGAINST() AS score" in your SELECT statement already does just that: it returns the score calculated by MySQL.
Note, however, that you execute the full-text search IN BOOLEAN MODE in the WHERE clause but not in the SELECT.
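Once the score comes back as a regular column, adding or subtracting points is ordinary application code. A minimal sketch in Python, assuming each fetched row is a dictionary with a "score" key; the sample rows, the "featured" field and the bonus rule are invented for illustration.

```python
def rescore(rows, adjust):
    """Return copies of the rows with adjusted scores, best score first."""
    rescored = [dict(row, score=row["score"] + adjust(row)) for row in rows]
    return sorted(rescored, key=lambda row: row["score"], reverse=True)


# Hypothetical rows as they might come back from the query.
rows = [
    {"title": "Old news", "score": 2.0, "featured": False},
    {"title": "Fresh story", "score": 1.5, "featured": True},
]


def bonus(row):
    """Example rule: featured articles get one extra point."""
    return 1.0 if row["featured"] else 0.0


ranked = rescore(rows, bonus)
print([row["title"] for row in ranked])  # ['Fresh story', 'Old news']
```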
I have the following query which is used in order to do an auto-complete of a search box:
SELECT *, MATCH (screen_name, name) AGAINST ('+query*' IN BOOLEAN MODE) AS SCORE
FROM users
WHERE MATCH (screen_name, name) AGAINST ('+query*' IN BOOLEAN MODE)
ORDER BY SCORE DESC LIMIT 3
I also have a FULL TEXT index on screen_name & name (together). When this table was relatively small (50k) this worked great. Now the table is ~200k and it takes seconds(!) to complete each query. I'm using MySql MyISAM. Is this reasonable? What directions might I check in order to improve this as surely it doesn't satisfy the needs of an auto-complete query.
MySQL MATCH ... AGAINST is really slow; you should look into alternatives like Sphinx Search Server.
I've been tearing my hair out over why this fails. I have the following code:
$query = "
SELECT DISTINCT title, caption, message, url, MATCH(title, caption, message, url) AGAINST ('$searchstring' ) AS score FROM news WHERE (valid = 1) AND MATCH(title, caption, message, url) AGAINST ('$searchstring' ) UNION ALL
SELECT DISTINCT title, caption, message, url, MATCH(title, caption, message, url) AGAINST ('$searchstring' ) AS score FROM paged WHERE (valid = 1) AND MATCH(title, caption, message, url) AGAINST ('$searchstring' ) ORDER BY score DESC";
I'm able to get search results from the paged table but not from the news table.
My guess as to what the problem is: stopwords.
From the documentation:
The stopword list applies. In addition, words that are present in 50% or more of the rows are considered common and do not match.
If paged didn't meet the criteria but news did then you'd get results for one and not the other.
Are both tables MyISAM? Or is news InnoDB?
Use this query to find out if you don't know.
select table_name
, engine
from information_schema.tables
where table_name in('news','paged');
This matters because InnoDB tables did not support FULLTEXT indexes before MySQL 5.6.
Got it at last, thanks guys. Here is the cause of the problem:
If a word is present in more than 50% of the rows it will have a weight of zero. This has advantages on large datasets, but can make testing difficult on small ones.
A natural language search interprets the search string as a phrase in natural human language (a phrase in free text). There are no special operators. The stopword list applies. In addition, words that are present in 50% or more of the rows are considered common and do not match. Full-text searches are natural language searches if the IN NATURAL LANGUAGE MODE modifier is given or if no modifier is given.
The table had only one entry, so the 50% threshold was exceeded.
Even when I duplicated the entry 5 times, the 50% threshold was still an issue and the relevance was 0, so I added the modifier, e.g.
SELECT * FROM table WHERE MATCH(col1,col2) AGAINST('search_term' IN BOOLEAN MODE)
This is my first time posting on stackoverflow...didn't expect to get responses so fast,
Thanks