I'm trying to use MySQL's full-text search (FTS) to search indexed content for certain keywords. For what I'm building, the text needs to contain one or more of the keywords, and the keywords must match exactly. However, it doesn't matter if a keyword appears in the middle of another word. For example, a search for "STACK" should match:
Hi, I have a stack of overflows
Stacked against the wall
These bookcases are overstacked completely
I was using the following method before:
SELECT ... FROM ... WHERE text LIKE '%keyword1%' OR text LIKE '%keyword2%' OR text LIKE '%keyword3%'
This returned any text that contained any of the keywords. However, it began to slow down pretty much everything, because most of the indexed content is big (stored in a blob) and I have over 500 of those rows to search through. Since LIKE with a leading wildcard cannot use an index, I tried converting to FTS using the following:
SELECT ... FROM ... WHERE MATCH(text) AGAINST ('+keyword1 +keyword2 +keyword3' IN BOOLEAN MODE)
This worked well with single keywords, but it fails with multiple keywords, because with the + operator FTS requires every given word to be present. Without the + operators, however, FTS returns fuzzy matches, not exact ones. I end up missing content that most definitely contains the keywords.
What can I use to get all content that contains one or more of the exact keywords?
Try something like this:
SELECT * FROM table WHERE MATCH (text) AGAINST ('+"keyword1" +"keyword2"' IN BOOLEAN MODE)
It sounds like you are looking for something like the following:
SELECT ... FROM ...
WHERE MATCH(text) AGAINST ('+keyword1' IN BOOLEAN MODE)
OR MATCH(text) AGAINST ('+keyword2' IN BOOLEAN MODE)
OR MATCH(text) AGAINST ('+keyword3' IN BOOLEAN MODE)
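If the keyword list comes from user input, the OR'ed clauses above could be generated instead of typed by hand. The following is only a sketch: the function name and the `content` table are hypothetical, and it assumes a Python MySQL driver that uses `%s` placeholders for bound parameters.

```python
def build_any_keyword_query(keywords, table="content", column="text"):
    """Return (sql, params) matching rows that contain at least one keyword.

    table and column are assumed to be trusted identifiers chosen by the
    developer; only the keywords travel as bound parameters.
    """
    terms = [k.strip() for k in keywords if k.strip()]
    if not terms:
        raise ValueError("at least one keyword is required")
    clause = f"MATCH({column}) AGAINST (%s IN BOOLEAN MODE)"
    where = "\n   OR ".join(clause for _ in terms)
    sql = f"SELECT * FROM {table}\nWHERE {where}"
    # Each keyword gets the + prefix, mirroring the hand-written query.
    params = ["+" + t for t in terms]
    return sql, params


sql, params = build_any_keyword_query(["keyword1", "keyword2", "keyword3"])
print(sql)
print(params)
```

The returned pair would be handed to the driver's execute call, so the keywords never get concatenated into the SQL text.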
I have a question about a query.
For example, the name column contains "Rodrigue Dattatray Desilva".
I want to write a query such that,
if I search for 'gtl' and those letters match anywhere in the string, the row is returned.
I know that in PHP I can build a pattern like '%g%t%l%'.
But I want to know the MySQL way.
Note: the search string can be anything; the above is just an example.
EDIT:
create table Test(id integer, title varchar(100));
insert into Test(id, title) values(1, "Rodrigue Dattatray Desilva");
select * from Test where title like '%g%t%l%';
Consider the above case, where "gtl" is the string I am trying to find in the title, but the search string can be anything.
The letters of "gtl" all exist in the current title, in that order, but not next to each other.
The easy answer is that you need an extra wildcard:
select * from Test where title like '%g%t%l%';
The query you posted does not have a wild card after the 'l', so would only match if the phrase ended with 'l'.
The more complicated answer is that you can also use regular expressions, which give you more power over the search.
The even more complicated answer is that performance of these string matching queries tends to be poor - the wild cards mean that indexes are usually ineffective. If you have a large number of rows in your table, full-text searching is much faster.
You can do the same in MySQL too.
You can use the LIKE keyword in MySQL, with these wildcards:
% - The percent sign represents zero, one, or multiple characters
_ - The underscore represents a single character
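Since the search string can be anything, the '%g%t%l%' pattern has to be built from the input. Here is a rough sketch of one way to do that; the function name is made up for illustration, and it assumes the default backslash escape character for LIKE.

```python
def subsequence_like_pattern(search, escape="\\"):
    """Turn 'gtl' into '%g%t%l%', escaping LIKE's own wildcards."""
    parts = []
    for ch in search:
        if ch in ("%", "_", escape):
            ch = escape + ch  # treat the character literally
        parts.append(ch)
    return "%" + "%".join(parts) + "%" if parts else "%"


print(subsequence_like_pattern("gtl"))  # %g%t%l%
```

The resulting pattern is meant to be passed as a bound parameter to a query such as select * from Test where title like ?, not pasted into the SQL string.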
I added a full-text search to my website and it works, but the results come back in a jumbled order. I would like the rows in which the search word or words are repeated most to appear at the top.
I have looked at online tutorials, but I can't get it to work; the results always come out jumbled and I don't understand why.
Right now I have it like this:
$sql="SELECT art,tit,tem,cred,info,que,ano,url
FROM contenido
WHERE MATCH (art,tit,tem,cred,info)
AGAINST ('" .$busqueda. "' IN BOOLEAN MODE)
ORDER BY id DESC";
There is not much information on the Internet about refining or optimizing MySQL FULLTEXT searches. Hopefully some experts will come by so we can all learn.
How could I refine this search? Thank you.
I think the issue is that you're sorting by the id.
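Full-text search calculates a match score for each row, and in natural language mode it returns stronger matches first. When you apply ORDER BY id DESC, you lose this sort-by-match ordering. Note that boolean mode does not sort rows by relevance on its own, so to get the strongest matches first there, select the score and order by it.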
You can see the actual score in your result set if you want by:
SELECT art,tit,tem,cred,info,que,ano,url,
MATCH (art,tit,tem,cred,info)
AGAINST ('your term' IN BOOLEAN MODE) AS score
FROM contenido
WHERE MATCH (art,tit,tem,cred,info)
AGAINST ('your term' IN BOOLEAN MODE)
ORDER BY score DESC
BTW: Use prepared statements for the 'your term' portion.
If your search string has spaces but each term matters, you need to treat the terms as separate pieces. So if it's important to have BOTH "Mercedes" AND "Benz":
Don't: AGAINST ('Mercedes Benz' IN BOOLEAN MODE) <--- This means either Mercedes or Benz
Do: AGAINST ('+Mercedes +Benz' IN BOOLEAN MODE)
If you want to have anything that must have the first term, but optionally the second term (ranking higher when both found) do:
AGAINST ('+Mercedes Benz' IN BOOLEAN MODE)
Here's a long list of combinations: https://dev.mysql.com/doc/refman/5.6/en/fulltext-boolean.html
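And don't forget to get rid of the ORDER BY id DESC and sort by the match score instead. I think your final query should look something like this for "Mercedes Benz":
SELECT art,tit,tem,cred,info,que,ano,url,
MATCH (art,tit,tem,cred,info)
AGAINST ('+Mercedes Benz' IN BOOLEAN MODE) AS score
FROM contenido
WHERE MATCH (art,tit,tem,cred,info)
AGAINST ('+Mercedes Benz' IN BOOLEAN MODE)
ORDER BY score DESC;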
Yep, full-text search in MySQL has a lot of quirks, but play around and you'll get the hang of it.
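To connect the prepared-statement advice with the + operator rules above, here is a small sketch of how the boolean-mode string might be assembled from user input before it is bound as a parameter. The function name is hypothetical, and stripping every operator character from the input is a design assumption, not something MySQL requires.

```python
import re

# Characters that carry meaning in boolean mode; removed from user input here.
_OPERATORS = re.compile(r'[+\-<>()~*"@]')


def boolean_expression(required, optional=()):
    """Build e.g. '+Mercedes Benz' from required and optional words."""
    def clean(words):
        cleaned = (_OPERATORS.sub(" ", w).split() for w in words)
        return [token for tokens in cleaned for token in tokens]

    must = ["+" + w for w in clean(required)]
    may = clean(optional)
    return " ".join(must + may)


print(boolean_expression(["Mercedes", "Benz"]))    # +Mercedes +Benz
print(boolean_expression(["Mercedes"], ["Benz"]))  # +Mercedes Benz
```

The returned string would then take the place of 'your term' as the bound value in AGAINST (? IN BOOLEAN MODE).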
I'm having an issue with my fulltext search query and I've pretty much gone through all the forums and threads I could find but I'm still having this issue.
I am using the MySQL Fulltext Boolean Mode search to return matching rows based on two columns (artist name and track title). The user can enter any phrase they wish into the search bar, and the results are supposed to be only rows that contain ALL parts of the search query in EITHER of the columns.
This is the query I have so far. It works for most searches, but for some it returns results too loosely and I'm not sure why.
SELECT * FROM tracks WHERE MATCH(artist, title) AGAINST('+paul +van +dyk ' IN BOOLEAN MODE)
This query however, returns rows containing Paul, without the 'Van' or 'Dyk'. Again, I want the query to return only rows that contain ALL of the keywords in EITHER the Artist or Track Name column.
Thanks in advance
To enhance sorting of the results in boolean mode you can use the following:
SELECT column_names, MATCH (text) AGAINST ('word1 word2 word3')
AS col1 FROM table1
WHERE MATCH (text) AGAINST ('+word1 +word2 +word3' in boolean mode)
order by col1 desc;
Using the first MATCH() we get the score in non-boolean search mode (more distinctive). The second MATCH() ensures we really get back only the results we want (with all 3 words).
So your query will become:
SELECT *, MATCH (artist, title) AGAINST ('paul van dyk')
AS score FROM tracks
WHERE MATCH (artist, title)
AGAINST ('+paul +van +dyk' in boolean mode)
order by score desc;
Hopefully you will get better results now.
Whether it works or not, please let me know.
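The two-MATCH query above can be produced from whatever phrase the user types. This is a sketch only: the function name is invented, it assumes a driver with `%s` placeholders, and it assumes the phrase consists of plain words with no boolean operators in it.

```python
def ranked_all_words_query(phrase, table="tracks", columns=("artist", "title")):
    """Return (sql, params): every word required, best natural-language score first."""
    words = phrase.split()
    if not words:
        raise ValueError("empty search phrase")
    cols = ", ".join(columns)
    sql = (
        f"SELECT *, MATCH ({cols}) AGAINST (%s) AS score\n"
        f"FROM {table}\n"
        f"WHERE MATCH ({cols}) AGAINST (%s IN BOOLEAN MODE)\n"
        "ORDER BY score DESC"
    )
    natural = " ".join(words)                    # used for scoring
    required = " ".join("+" + w for w in words)  # used for filtering
    return sql, (natural, required)


sql, params = ranked_all_words_query("paul van dyk")
print(sql)
print(params)  # ('paul van dyk', '+paul +van +dyk')
```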
Piggybacking off of @Avidan's answer: these are the operators that can be used in boolean full-text searches to help create the query you want.
The following examples demonstrate some search strings that use
boolean full-text operators:
'apple banana'
Find rows that contain at least one of the two words.
'+apple +juice'
Find rows that contain both words.
'+apple macintosh'
Find rows that contain the word “apple”, but rank rows higher if they
also contain “macintosh”.
'+apple -macintosh'
Find rows that contain the word “apple” but not “macintosh”.
'+apple ~macintosh'
Find rows that contain the word “apple”, but if the row also contains
the word “macintosh”, rate it lower than if row does not. This is
“softer” than a search for '+apple -macintosh', for which the presence
of “macintosh” causes the row not to be returned at all.
'+apple +(>turnover <strudel)'
Find rows that contain the words “apple” and “turnover”, or “apple”
and “strudel” (in any order), but rank “apple turnover” higher than
“apple strudel”.
'apple*'
Find rows that contain words such as “apple”, “apples”, “applesauce”,
or “applet”.
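'"some words"'
Find rows that contain the exact phrase “some words” (for example, rows that contain “some words of wisdom” but not “some noise words”).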
Docs: https://dev.mysql.com/doc/refman/5.5/en/fulltext-boolean.html
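To make the matching rules in the list above concrete, here is a toy Python model of a few of the operators. It is not how MySQL implements boolean mode and it computes no ranking; it only approximates which rows would be returned for bare words, +word, -word, ~word and a trailing *. Parentheses, quoted phrases, < and > are deliberately left out, and splitting on whitespace is a simplifying assumption.

```python
def matches(text, query):
    """Approximate whether a row would be returned by a simple boolean query."""
    words = text.lower().split()

    def present(term):
        if term.endswith("*"):
            return any(w.startswith(term[:-1]) for w in words)
        return term in words

    required, optional = [], []
    for token in query.lower().split():
        if token.startswith("-"):
            if present(token[1:]):
                return False  # an excluded word rules the row out
        elif token.startswith("+"):
            required.append(token[1:])
        else:
            # ~word only lowers the rank, so it is optional for matching
            optional.append(token.lstrip("~"))
    if required:
        return all(present(t) for t in required)
    return any(present(t) for t in optional)


print(matches("apple pie", "apple banana"))          # True
print(matches("apple pie", "+apple +juice"))         # False
print(matches("apple macintosh", "+apple -macintosh"))  # False
print(matches("fresh applesauce", "apple*"))         # True
```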
When I enter a word (e.g. freshers), I want both the fresher and the freshers records, but currently I am getting only the freshers records. My code looks like this:
$search='freshers';
$qry=mysql_query("select count(*) from jobs where job_title like '%$search%' or MATCH(job_title)
AGAINST('$search' IN BOOLEAN MODE)");
When the search word is freshers, I get a count of 1200. When the search word is fresher, I get a count of 2000.
How can I get almost the same count whether I enter freshers or fresher?
You cannot get precisely the same matching-record count from any MySQL technology with a search term that's either singular or plural.
MySQL doesn't have the smarts to know that freshers is the plural of fresher or children is the plural of child. If you want to do this with MySQL you'll have to start your search with the singular form of the word for which you want the plural.
Neither does MySQL know that mice is the plural of mouse.
If you need automated plural/singular functionality you may want to investigate Lucene or some other natural language search tech. The name of the capability you seek is "stemming."
But you can use FULLTEXT search with terms with trailing asterisks. For example, 'fresher*' matches fresher, freshers, and even fresherola. This will extend a search from singular to plural. It will not work the other way around.
select count(*)
from jobs
where MATCH(job_title) AGAINST('fresher*' IN BOOLEAN MODE)
There are some other modifying characters for boolean mode search terms. They are mentioned here:
http://dev.mysql.com/doc/refman/5.5/en/fulltext-boolean.html
Pro tip: column LIKE '%searchterm%' is probably the slowest way MySQL offers to search a column. It is guaranteed to scan the whole table.
Pro tip: FULLTEXT search is inherently a bit fuzzy. Expecting crisp record counts from it is probably a path to confusion.
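Since the trailing asterisk only extends a term from singular to plural, one option is to reduce the user's word to a singular-looking form before appending the asterisk. The helper below is a deliberately naive sketch: its name is hypothetical, and it assumes a plural is simply the singular plus a trailing "s", which is wrong for irregular plurals such as children or mice.

```python
def prefix_term(word):
    """Return a boolean-mode prefix term such as 'fresher*'.

    Naive assumption: a plural is the singular plus a trailing 's'.
    """
    base = word.strip().lower()
    if len(base) > 3 and base.endswith("s") and not base.endswith("ss"):
        base = base[:-1]
    return base + "*"


print(prefix_term("freshers"))  # fresher*
print(prefix_term("fresher"))   # fresher*
```

Both inputs then produce the same AGAINST term, so the two searches return the same count. Real stemming, as mentioned above, is a job for a dedicated search engine.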
Try this:
$search='fresher';
$qry=mysql_query("select count(*) from jobs where job_title like '$search%'");
I have a table with the following structure: [id - title - tag - ..]
I want to achieve the following:
Suppose there is a record in the table with the title "I love my job and it is my hobby".
If a query containing two words from that sentence is submitted, e.g. "love hobby", this sentence should be selected. The query should return the title above and not, for example, "I love my job". At the very least, titles matching more of the query keywords should come first, followed by those matching fewer.
How can I do this search on the title column of my table?
I apologize if the explanation is not clear; I'm more than happy to clarify.
Thank you all
Try this:
SELECT title FROM your_table WHERE title LIKE '%love%' AND title LIKE '%hobby%'
Look into MySQL's built-in full-text search capabilities. In boolean mode, you could transform your query to +love +hobby and have results returned without full table scans. Be aware that before MySQL 5.6 this only works with MyISAM tables, so on older versions you might want to move the indexed data out of the main tables, since MyISAM doesn't support things like foreign keys or transactions.
For more advanced free-text indexing you could try Sphinx (which also has a MySQL look-and-feel interface) or Solr.
If you're using MyISAM or InnoDB, you can use the MySQL full-text search:
SELECT * FROM table_name WHERE MATCH (title) AGAINST ('love hobby' IN BOOLEAN MODE);
Without operators, it will also match rows that contain only one of the words; use '+love +hobby' to require both.
Read this: https://dev.mysql.com/doc/refman/5.5/en/fulltext-boolean.html
You can also use MySQL REGEXP
SELECT title FROM table WHERE title REGEXP ' love .+ hobby';
If you have it as a single string then try:
SELECT title FROM table WHERE title REGEXP REPLACE('love hobby', ' ', '.+ ');
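One thing worth checking with the REGEXP approach is that the pattern is order-sensitive. The sketch below uses Python's re module to illustrate the point; Python's regex engine is not MySQL's, but this pattern only uses features the two have in common. The helper name is made up and simply mirrors the REPLACE call above.

```python
import re

title = "I love my job and it is my hobby"


def replace_style_pattern(query):
    # Mirrors REPLACE(query, ' ', '.+ ') from the SQL above.
    return query.replace(" ", ".+ ")


print(replace_style_pattern("love hobby"))                          # love.+ hobby
print(bool(re.search(replace_style_pattern("love hobby"), title)))  # True
print(bool(re.search(replace_style_pattern("hobby love"), title)))  # False
```

So "hobby love" would not find the title, whereas the LIKE ... AND LIKE and full-text variants do not care about word order.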