Searching for a term using SQL MATCH which includes spaces - php

This may be a newbie question, as I'm not an expert in SQL. However, I couldn't find the answer using Google.
I have a table called record_fields which contains the majority of my system's content, which I want to search in. The content cell is defined as LONGTEXT as it can include extremely long input.
Originally, I used (simplifying the query a bit for clarity's sake):
SELECT * FROM record_fields WHERE LOWER(content) LIKE LOWER('%{$keyword}%')
Execution time aside, this query has one major issue. If I search for the term "post" it will return all content which has words like "poster", "posting" and others. I wanted to add a FULLTEXT search.
Now the query looks like this (again, simplified):
SELECT * FROM record_fields WHERE MATCH (content) AGAINST ('{$keyword}')
However, this is still problematic. With MATCH, if my system's users search for the words "Bank of America", for example, all records that have either the word "Bank" or the word "America" will be returned.
TL;DR - my question is this:
How do I use MATCH to search for exact phrases with spaces in them?
Any help would be highly appreciated, thanks in advance!

LIKE '%{$keyword}%' matches any string that contains your keyword anywhere, including inside longer words. MATCH normally treats each word in the search string as a separate search term and matches against each one. You can use boolean mode and put a + symbol before each required keyword. Take a look at the MySQL reference for this.
Edited the answer to reflect Idan's response in not getting the results from the suggested %keyword solution.

You can use MATCH ... AGAINST in boolean mode and wrap your input string in double quotes, like '"{$keyword}"', so that it is matched as a phrase.
Check the last example at the link below:
https://dev.mysql.com/doc/refman/5.5/en/fulltext-boolean.html
SELECT * FROM record_fields WHERE MATCH (content) AGAINST ('"{$keyword}"' IN BOOLEAN MODE )
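The two boolean-mode suggestions above (a + before each required keyword, or double quotes around the whole phrase) can be sketched as small helpers that build the AGAINST string. This is a minimal illustration in Python rather than PHP; the function names are hypothetical, and the %s placeholder assumes a MySQL driver that uses that parameter style. Stripping the boolean operators from user input is an assumption about what a search box should allow, and it also splits hyphenated words.

```python
import re

# Characters that act as operators in MySQL boolean-mode full-text search.
_BOOLEAN_OPERATORS = re.compile(r'[+\-<>()~*"@]')


def _clean(keyword: str) -> str:
    """Replace boolean-mode operators with spaces and collapse whitespace."""
    return " ".join(_BOOLEAN_OPERATORS.sub(" ", keyword).split())


def boolean_phrase(keyword: str) -> str:
    """Wrap the input in double quotes so it is matched as one exact phrase."""
    return '"' + _clean(keyword) + '"'


def boolean_all_terms(keyword: str) -> str:
    """Prefix every word with + so that each word is required."""
    return " ".join("+" + word for word in _clean(keyword).split())


# The search expression is passed as a bound parameter, not concatenated.
SQL = (
    "SELECT * FROM record_fields "
    "WHERE MATCH (content) AGAINST (%s IN BOOLEAN MODE)"
)
params = (boolean_phrase("Bank of America"),)
```

With a driver cursor this would be run as cursor.execute(SQL, params), which keeps the user's text out of the SQL string itself.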

Related

How to implement Full Text search in InnoDB?

I have a question about a query.
For example, the name column contains "Rodrigue Dattatray Desilva".
I want to write a query so that if I search for 'gtl' and those letters match anywhere in the string, the row is returned.
I know that from PHP I can build a pattern like '%g%t%l%'.
But I want to know the MySQL way.
Note: the search string can be anything; the above is just an example.
EDIT:
create table Test(id integer, title varchar(100));
insert into Test(id, title) values(1, "Rodrigue Dattatray Desilva");
select * from Test where title like '%g%t%l%';
Consider the above case, where "gtl" is the string I am trying to find in the title, but the search string can be anything.
The letters of "gtl" all occur in the current title in that order, but not next to each other.
The easy answer is that you need an extra wildcard:
select * from Test where title like '%g%t%l%';
The query you posted does not have a wild card after the 'l', so would only match if the phrase ended with 'l'.
The more complicated answer is that you can also use regular expressions, which give you more power over the search.
The even more complicated answer is that performance of these string matching queries tends to be poor - the wild cards mean that indexes are usually ineffective. If you have a large number of rows in your table, full-text searching is much faster.
You can do the same in MySQL too.
You can use the LIKE keyword in MySQL, with these wildcards:
% - The percent sign represents zero, one, or multiple characters
_ - The underscore represents a single character
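To search for an arbitrary string this way, the pattern has to be built from the user's input by putting a % between every character. The sketch below does that in Python (the function names are hypothetical) and escapes any literal % or _ in the input, assuming the default backslash escape character for LIKE. The is_subsequence helper only mirrors in plain Python what the resulting pattern matches.

```python
def escape_like(text: str, escape: str = "\\") -> str:
    """Escape LIKE wildcards so user input is matched literally."""
    return (
        text.replace(escape, escape + escape)
        .replace("%", escape + "%")
        .replace("_", escape + "_")
    )


def subsequence_pattern(term: str) -> str:
    """Build a LIKE pattern that matches the characters of term in order."""
    return "%" + "%".join(escape_like(ch) for ch in term) + "%"


def is_subsequence(term: str, text: str) -> bool:
    """Plain-Python equivalent: are the characters of term in text, in order?"""
    remaining = iter(text.lower())
    return all(ch in remaining for ch in term.lower())


pattern = subsequence_pattern("gtl")  # bind this as the LIKE parameter
```

The pattern would then be bound to a query such as select * from Test where title like ? instead of being concatenated into it.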

SQL Search for George vs. Georges

I am trying to do a search query with SQL; my page contains an input field whose value is taken and simply concatenated to my SQL statement.
So, Select * FROM users after a search then becomes SELECT * FROM users WHERE company LIKE '%georges brown%'.
It then returns results based on what the user types in; in this case Georges Brown. However, it only finds entries whose companies are typed out exactly as Georges Brown (with an 's').
What I am trying to do is return a result set that not only contains entries with Georges but also George (no 's').
Is there any way to make this search more flexible so that it finds results with Georges and George?
Try using more wildcards around george.
SELECT * FROM users WHERE company LIKE '%george% %brown%'
Try this query:
SELECT *
FROM users
WHERE company LIKE '%george% brown%'
Use SOUNDEX
http://dev.mysql.com/doc/refman/5.0/en/string-functions.html#function_soundex
You can also remove the last 2 characters, get the SOUNDEX codes, and compare them.
You'll have to look at the documentation of your database system. MySQL for example provides the SOUNDEX function.
Otherwise, what should always work and give you better matching is to only work on upper or lower cased strings. SQL-92 defines the TRIM, UPPER, and LOWER functions. So you'd do something like WHERE UPPER(company) LIKE UPPER('%georges brown%').
In specific cases you can use a wildcard:
WHERE company LIKE '%george_ brown%' -- will match `georges` but not `georgeani`
_ is a single-character wildcard, while % is a multi-character wildcard.
But maybe it's better to use another piece of software for indexing, like Sphinx.
It has:
"Flexible text processing. Sphinx indexing features include full support for SBCS and UTF-8 encodings (meaning that effectively all world's languages are supported); stopword removal and optional hit position removal (hitless indexing); morphology and synonym processing through word forms dictionaries and stemmers; exceptions and blended characters; and many more."
It allows you to do smarter searches with partial matches, while providing more accuracy than Soundex, for example.
Probably best to explode out your search string into individual words then find the plural / singular of each of those words. Then do a like for both possibilities for each word.
However for this to be usably efficient on large amounts of data you probably want to run against a table of words linked to each company.
Soundex alone probably isn't much use, as too many words come out the same (it gives you a 4-character code, the first character being the first character of the word, while the next 3 are a numeric code). Levenshtein is more accurate, but MySQL has no built-in function for it, although PHP does have a fast one (the MySQL functions I found to calculate it were far too slow to be useful on a large search).
What I did for a similar search function was to take the input string and explode it into words, then convert those words to their singular form (my table of used words just contains singular versions of words). For each word I then found all the used words starting with the same letter and used Levenshtein to get the best match(es), and from these listed the possible matches. This made it possible to cope with typos (so it would likely find George if someone entered Goerge), and also to find best matches (i.e., if someone searched on 5 words but only 4 were found). It could also come up with a few alternatives if the spelling was miles out.
You may also want to look up Metaphone and Double Metaphone.
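The matching step described above (take a word, look only at known words starting with the same letter, keep the closest by Levenshtein distance) could look roughly like this. It is a Python sketch rather than the answerer's PHP, the function names and the sample word list are hypothetical, and the singularisation step is left out.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions and substitutions."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(
                min(
                    previous[j] + 1,         # deletion
                    current[j - 1] + 1,      # insertion
                    previous[j - 1] + cost,  # substitution or match
                )
            )
        previous = current
    return previous[-1]


def best_matches(word: str, known_words: list[str]) -> list[str]:
    """Return the known words sharing the first letter with the lowest distance."""
    word = word.lower()
    candidates = [w for w in known_words if w[:1].lower() == word[:1]]
    if not candidates:
        return []
    scored = [(levenshtein(word, w.lower()), w) for w in candidates]
    lowest = min(distance for distance, _ in scored)
    return [w for distance, w in scored if distance == lowest]


matches = best_matches("Goerge", ["george", "brown", "gregory", "bank"])
```

Restricting candidates to the same first letter keeps the number of distance calculations small, at the cost of missing typos in the first character.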

measure relevance of a search term returned matched against results

Is there a PHP or MySQL function which will check how relevant a matching field is? Could it review the string and match against a percentage of characters?
For example I am doing a basic search script pulling back results but how can I make the more relevant results appear at the top?
A lot depends on your data and the type of searches that you are expecting. But basically, you could be looking for a fuzzy search. Soundex and Levenshtein distance are two of the many functions that you can use for string matching.
http://php.net/manual/en/function.levenshtein.php
Well, you are asking a few complicated questions here. Mostly, I think you are looking for information retrieval techniques. Some answers are all over Stack Overflow.
The question "What tried and true algorithms for suggesting related articles are out there?" is a great one, I think.
You might want to use the levenshtein distance if you are just looking for how closely a keyword matches an existing keyword.
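For the "percentage of characters" part of the question, one way to get a similarity score between 0 and 1 and sort by it is sketched below. This uses Python's standard difflib module as a stand-in, which is not the Levenshtein function mentioned above; the function names and sample rows are hypothetical.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a case-insensitive similarity ratio between 0.0 and 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def rank_by_similarity(term: str, rows: list[str]) -> list[str]:
    """Sort rows so that the ones most similar to term come first."""
    return sorted(rows, key=lambda row: similarity(term, row), reverse=True)


ranked = rank_by_similarity("bank", ["banking", "tank", "bank"])
```

Scoring in application code like this only makes sense on a result set that the database has already narrowed down, since every row has to be compared individually.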
I tried :P
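MySQL has a function MATCH.
You can use it like this:
SELECT * FROM `table` WHERE MATCH(content) AGAINST('search text')
It scores how relevant each row's content is to the search text.
But you need a FULLTEXT index on the content field, which requires the table type MyISAM (InnoDB also supports FULLTEXT indexes from MySQL 5.6 onwards).
When MATCH is used like this in the WHERE clause with no ORDER BY, the output is automatically sorted by relevance, highest first.
Hope this helps.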

How do I include plurals but exclude singulars? [duplicate]

I am building a site with a requirement to include plural words but exclude singular words, as well as include longer phrases but exclude shorter phrases found within them.
For example:
a search for "Breads" should return results with 'breads' within it, but not 'bread' or 'read'.
a search for "Paperback book" should return results with 'paperback book' within it, but not 'paperback' or 'book'.
The query I have tried is:
SELECT * FROM table WHERE (field LIKE '%breads%') AND (field NOT LIKE '%bread%')
...which clearly returned no results, even though there are records with 'breads' and 'bread' in it.
I understand why this query is failing (I'm telling it to both include and exclude the same strings) but I cannot think of the correct logic to apply to the code to get it working.
Searching for %breads% would NEVER return bread or read, as the 's' is a required character for the match. So just eliminate the and clause:
SELECT ... WHERE (field LIKE '%breads%')
SELECT ... WHERE (field LIKE '%paperback book%');
You should consider using FULL TEXT SEARCH.
This will solve your Bread/read issue.
I believe the use of wildcards here isn't useful. Let's say you are using '%read%': this would also return bread, breads, etc., which is why I recommended full-text search.
With MySQL you can use REGEXP instead of like which would give you better control over your query...
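SELECT * FROM table WHERE field REGEXP '[[:<:]]read[[:>:]]'
(The [[:<:]] and [[:>:]] markers are the word-boundary syntax in MySQL before 8.0.4; from 8.0.4 on, use '\\bread\\b' instead. A plain '\s' is not usable here, because MySQL strips the backslash inside a string literal.)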
That would at least enforce word boundaries around your query and gives you much better control over your matching - with the downside of a performance hit though.
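The effect of word boundaries on the examples from the question can be checked outside the database. The sketch below uses Python's re module, whose \b behaves like the word-boundary markers discussed above; contains_word is a hypothetical helper name.

```python
import re


def contains_word(word: str, text: str) -> bool:
    """True if word (or phrase) occurs in text as a whole word, ignoring case."""
    pattern = r"\b" + re.escape(word) + r"\b"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None


# "breads" matches only the plural, and "read" does not match inside "bread".
plural_only = contains_word("breads", "Artisan breads daily")
inside_word = contains_word("read", "fresh bread")
```

Escaping the search term with re.escape keeps characters such as . or * in user input from being interpreted as regex syntax.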

PHP MySQL search without "LIKE" clause?

I am using PHP and MySQL.
I have an application in which the user enters any text, and I want to find related data in the database without using the "LIKE" clause in my MySQL query.
Is there any possible way to search for these strings in the database,
or any approach in MySQL to do this?
Thanks in advance.
You can also check out MATCH clause.
You can use REGEXP: when the user enters a single word, you put WHERE field REGEXP '.*TEXT.*' in your query. Regex is useful because you can let the user type a regular expression into the search field.
If you don't want to use LIKE, and don't give a reason why (it seems fine for everyone else), then here is a solution that gets you around it. (But it might not be the best real-world option...)
Whenever anything is added to the database that you want to be searched, take each word and break it into every possible combination of 1 or more consecutive letters.
E.g. for stack:
s, t, a, c, k, st, ta, ac, ck, sta, tac, ack, stac, tack, stack
Insert each of these into a table with an identifier that links to the original data.
Then you can match any search query against this list of words exactly (for full and partial matches). If your user is searching for multiple keywords, you split them in the front end and search for each, looking for matches to the same identifier.
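The scheme above can be sketched in memory as follows. This is a Python illustration with hypothetical names, where a dictionary stands in for the lookup table of fragments and record identifiers; a real version would insert the fragments into a database table instead.

```python
def consecutive_substrings(word: str) -> list[str]:
    """Every run of 1 or more consecutive letters, shortest first."""
    n = len(word)
    return [
        word[start:start + length]
        for length in range(1, n + 1)
        for start in range(n - length + 1)
    ]


def build_index(records: dict[int, str]) -> dict[str, set[int]]:
    """Map each fragment of each word to the ids of the records containing it."""
    index: dict[str, set[int]] = {}
    for record_id, text in records.items():
        for word in text.lower().split():
            for fragment in consecutive_substrings(word):
                index.setdefault(fragment, set()).add(record_id)
    return index


def search(index: dict[str, set[int]], query: str) -> set[int]:
    """Exact lookup per keyword, keeping only ids that match every keyword."""
    ids = None
    for keyword in query.lower().split():
        matches = index.get(keyword, set())
        ids = matches if ids is None else ids & matches
    return ids or set()


index = build_index({1: "stack overflow", 2: "haystack"})
```

A word of n letters produces n(n+1)/2 fragments, so the lookup table grows quickly with long words; that is the storage cost paid for avoiding LIKE.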
