Autocomplete SQL Query suggestions (Ajax+PHP) - php

I have a question regarding SQL best practices when formulating a query for use in an autocomplete form (jQuery Ajax + PHP).
Let us assume the following:
I have a database with the titles of books
Some books have titles without a leading article ("The" or "A"), such as "Life of Pi"
Some books have titles with a leading article ("The" or "A"), such as "The Catcher in the Rye"
As a result, users will type the title either with "The" at the beginning or without any leading article.
Three possible queries seem to exist:
SELECT `title` FROM `books` WHERE `title` LIKE '%$string'
or
SELECT `title` FROM `books` WHERE `title` LIKE '$string%'
or
SELECT `title` FROM `books` WHERE `title` LIKE '%$string%'
When using the first query method (where the % is before the string), it is difficult to get any results, since the wildcard before the string seems to behave erroneously.
When using the second query, it seems to favor exact matches using "The" before a title. Thus, a user searching for "The Catcher in the Rye" will find the book, but a user searching for "Catcher in the Rye" will not.
The last result is the best one, since it has a wildcard before and after the string. However, it also gives the longest auto-complete list. The user will have to type a few letters to narrow down the search result.
Any ideas on implementing a more efficient query? Or is the third option the best one (seeing as it is not feasible to separate the article from the title of a book)?
Thanks in advance,

You can do a search using regular expressions (the query result comes back quickly),
and do not forget to limit your results.
A small example:
SELECT title FROM books WHERE title REGEXP '$string' LIMIT 20
Or you can use word boundaries:
SELECT title FROM books WHERE title REGEXP '[[:<:]]$string[[:>:]]' LIMIT 20
See the documentation: http://dev.mysql.com/doc/refman/5.5/en/regexp.html

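// $link is assumed to be an open mysqli connection; escape $string
// (e.g. with mysqli_real_escape_string) before interpolating it.
$result = mysqli_query($link, "SELECT title FROM books WHERE title REGEXP '$string'");
if ($result->num_rows == 0) {
    // First remove all the stop words like for, the, to, of, a from the search string.
    $stopWords = array('/\bfor\b/i', '/\bthe\b/i', '/\bto\b/i', '/\bof\b/i', '/\ba\b/i');
    $string = trim(preg_replace($stopWords, "", $string));
    // Then run the query again with the reduced search string.
    $result = mysqli_query($link, "SELECT title FROM books WHERE title REGEXP '$string'");
}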

I would suggest using the third method with wildcards on either side of the string. If you are worried about the size of the returned result set, perhaps limit the results to a certain number, and as the user types the list will naturally get smaller and more specific.
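As a rough illustration of the "wildcards on both sides plus a limit" idea, here is a minimal Python sketch. It uses the standard-library sqlite3 module purely as a stand-in for MySQL; the helper names `escape_like` and `suggest_titles`, the limit of 10, and the sample rows are assumptions for the example, not part of the original question. The search term is bound as a parameter and its own `%` and `_` characters are escaped, so user input is matched literally.

```python
import sqlite3

def escape_like(term, escape_char="\\"):
    """Escape LIKE wildcards so the user's input is matched literally."""
    for ch in (escape_char, "%", "_"):
        term = term.replace(ch, escape_char + ch)
    return term

def suggest_titles(conn, term, limit=10):
    """Return up to `limit` titles that contain `term` anywhere."""
    pattern = "%" + escape_like(term) + "%"
    rows = conn.execute(
        "SELECT title FROM books WHERE title LIKE ? ESCAPE '\\' "
        "ORDER BY title LIMIT ?",
        (pattern, limit),
    )
    return [row[0] for row in rows]

# Hypothetical sample data for the demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE books (title TEXT)")
conn.executemany(
    "INSERT INTO books VALUES (?)",
    [("The Catcher in the Rye",), ("Life of Pi",), ("Catch-22",)],
)

print(suggest_titles(conn, "Catcher in the Rye"))
```

With this shape, typing "Catcher in the Rye" finds the book even though the stored title starts with "The", and the LIMIT keeps the suggestion list short while the user is still typing.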

You may also consider allowing searches such as 'Catcher Rye', which should still match.
In this case, you would tokenize each word in the title as well as the words entered by the user and find the best matches.
Otherwise, only autocomplete after, say, 4 or more characters have been entered, and use option 3.
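The tokenizing idea could look roughly like the following Python sketch. It is only one possible interpretation of "find the best matches": each query word is treated as a prefix of a title word, and titles are ranked by how many query words they match. The function names and the sample titles (including the made-up "Rye Bread Basics") are assumptions for illustration.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def rank_titles(query, titles, limit=10):
    """Rank titles by how many query words prefix-match a title word."""
    query_tokens = tokenize(query)
    scored = []
    for title in titles:
        title_tokens = tokenize(title)
        hits = sum(
            1
            for q in query_tokens
            if any(t.startswith(q) for t in title_tokens)
        )
        if hits:
            scored.append((hits, title))
    # Most matched words first, then alphabetical for a stable order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [title for _, title in scored[:limit]]

titles = ["The Catcher in the Rye", "Life of Pi", "Rye Bread Basics"]
print(rank_titles("Catcher Rye", titles))
```

Because matching is done per word, the position of "The" in the stored title no longer matters.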

If you're worried about the quantity of suggestions, can you modify the change event to only retrieve suggestions after they have typed some minimum number of characters in the field?

Related

Advanced search term in php and MySql

I have a MySQL table containing a title field.
Suppose a user enters a term in an input textbox.
Now I want to select rows whose title field is in one of the following states:
1) its title is exactly the same as the term : "sugar">>"sugar"
2) one of the words of its title is like the term : "pretty flower" >> "rose flower"
3) its title is from the same word family as the term (the term is the root word of those) : "biology" >> "biography, biodegradable, symbiotic"
I'm using Laravel. If you can suggest any solution for that, it would be great.
This was an easy solution until you mentioned word family...
However, this is something that is still achievable. It's about how you approach it.
You'll start by using the like condition in MySQL.
You can read more about it on the MySQL Website relating to Pattern Matching
Here's an example of that:
SELECT * FROM my_table WHERE field1 LIKE '%searchTerm%';
In terms of getting results by "word family", you may want to consider adding tags to each row.
Add a tags field to the table (field2 in the query below) and store in it the tags that relate to the "word family".
Your query would then look something like this:
SELECT * FROM my_table WHERE field1 LIKE '%searchTerm%' OR field2 LIKE '%searchTerm%';
You'll need to loop through the array of tags to find what matches, if anything.
Just my approach to get started.
In terms of how to do this in Laravel, your query may look something like this, according to their documentation:
$results = DB::table('table1')
->where('myfield', 'like', '%searchterm%')
->get();
I surely hope this puts you in the right direction.

Searching for a term using SQL MATCH which includes spaces

This may be a newbie question, as I'm not an expert in SQL. However, couldn't find the answer using Google.
I have a table called record_fields which contains the majority of my system's content, which I want to search in. The content cell is defined as LONGTEXT as it can include extremely long input.
Originally, I used (simplifying the query a bit for clarity's sake):
SELECT * FROM record_fields WHERE LOWER(content) LIKE LOWER('%{$keyword}%')
Execution time aside, this query has one major issue. If I search for the term "post" it will return all content which has words like "poster", "posting" and others. I wanted to add a FULLTEXT search.
Now the query looks like this (again, simplified):
SELECT * FROM record_fields WHERE MATCH (content) AGAINST ('{$keyword}')
However, this is still problematic. With MATCH, if my system's users search for the words "Bank of America", for example, all records that have either the word "Bank" or the word "America" will be returned.
TL;DR - my question is this:
How do I use MATCH to search for exact phrases with spaces in them?
Any help would be highly appreciated, thanks in advance!
%{keyword}% matches all text sub-strings that include your keyword anywhere in the string. MATCH usually takes all keywords in the match string as individual search terms, and matches against each. You can use boolean mode and use a + symbol before each required keyword. Take a look at the MySQL reference for this.
Edited the answer to reflect Idan's response in not getting the results from the suggested %keyword solution.
You can use MATCH ... AGAINST in boolean mode and wrap your input string in double quotes, as in '"{$keyword}"', so that it is treated as a single phrase.
See the last example at the link below:
https://dev.mysql.com/doc/refman/5.5/en/fulltext-boolean.html
SELECT * FROM record_fields WHERE MATCH (content) AGAINST ('"{$keyword}"' IN BOOLEAN MODE )
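A small sketch of how the quoted phrase might be built safely, written in Python for illustration. The function name `phrase_query` is hypothetical, and the `%s` placeholder is the style used by common Python MySQL drivers (in PHP the equivalent would be a `?` placeholder in a prepared statement). The point is that the double quotes belong to the search phrase itself, while the value as a whole is bound as a parameter instead of being interpolated into the SQL.

```python
def phrase_query(keyword):
    """Build a parameterized MATCH ... AGAINST query for an exact phrase."""
    # Double quotes delimit the phrase in boolean mode, so replace any
    # that the user typed before wrapping the whole input in quotes.
    phrase = '"' + keyword.replace('"', " ").strip() + '"'
    sql = (
        "SELECT * FROM record_fields "
        "WHERE MATCH (content) AGAINST (%s IN BOOLEAN MODE)"
    )
    return sql, (phrase,)

sql, params = phrase_query("Bank of America")
print(sql)
print(params)
```

The returned pair would then be passed to the driver's execute call, which is assumed here and not shown.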

How do I include plurals but exclude singulars? [duplicate]

I am building a site with a requirement to include plural words but exclude singular words, as well as include longer phrases but exclude shorter phrases found within them.
For example:
a search for "Breads" should return results with 'breads' within it, but not 'bread' or 'read'.
a search for "Paperback book" should return results with 'paperback book' within it, but not 'paperback' or 'book'.
The query I have tried is:
SELECT * FROM table WHERE (field LIKE '%breads%') AND (field NOT LIKE '%bread%')
...which clearly returned no results, even though there are records with 'breads' and 'bread' in it.
I understand why this query is failing (I'm telling it to both include and exclude the same strings) but I cannot think of the correct logic to apply to the code to get it working.
Searching for %breads% would NEVER return bread or read, as the 's' is a required character for the match. So just eliminate the AND clause:
SELECT ... WHERE (field LIKE '%breads%')
SELECT ... WHERE (field LIKE '%paperback book%');
You should consider using full-text search.
This will solve your bread/read issue.
I believe the use of wildcards isn't helpful here. Let's say you are using '%read%': this would also return bread, breads, etc., which is why I recommend full-text search.
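With MySQL you can use REGEXP instead of LIKE, which gives you better control over your query:
SELECT * FROM table WHERE field REGEXP '[[:<:]]read[[:>:]]'
That at least enforces word boundaries around your search term and gives you much better control over your matching, with the downside of a performance hit. (A whitespace-based pattern such as '\s+read\s+' would miss the word at the very start or end of the field.) Note that MySQL 8.0.4 and later use a different regex library, where the word-boundary marker is \b, written '\\bread\\b' in a string literal, instead of [[:<:]] and [[:>:]].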
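To see how word-boundary matching behaves on the examples from the question, here is a minimal Python sketch using the standard `re` module. It illustrates the matching logic only; it is not MySQL code, and `contains_word` is a hypothetical helper name.

```python
import re

def contains_word(text, word):
    """True if `word` (or phrase) occurs in `text` as whole words only."""
    pattern = r"\b" + re.escape(word) + r"\b"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None

print(contains_word("Fresh breads daily", "breads"))   # whole-word match
print(contains_word("Fresh bread daily", "breads"))    # singular only
print(contains_word("bread", "read"))                  # substring only
print(contains_word("read", "read"))                   # start/end of string
```

A search for "breads" matches only the plural, "read" is not found inside "bread", and a word at the very start or end of the text still matches.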

Database Search minus "The" prefix

Could someone please point me in the right direction? I currently have a searchable database and ran into a problem with searching by title.
If the title begins with "The" then obviously the title will be in the 'T' section. What is a good way to keep "The" from being searched? Should I concatenate two fields to display the title but search only the second one, ignoring the prefix? Or is there another way to do this? Advice or direction would be great. Thanks.
A few choices:
a) Store the title in "Library" format, which means you process the title and store it as
Scarlet Pimpernel, The
Tale of Two Cities, A
b) Store the original unchanged title for display purposes, and add a new "library_title" field to store the processed version from a).
c) Add a new field to store the articles, and the bare title in title field. For display, you'd concatenate the two fields, for searching you'd just look in the title field.
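The processing step behind options a) to c) might be sketched like this in Python. The function names are hypothetical, and treating "An" as an article alongside "The" and "A" is an assumption added for the example. `split_article` gives the two pieces needed for option c), and `to_library_title` builds the "Library" format used in options a) and b).

```python
import re

# Leading articles to move; "An" is an assumption beyond the answer's examples.
_ARTICLE_RE = re.compile(r"^(the|an|a)\s+(.+)$", re.IGNORECASE)

def split_article(title):
    """Return (article, bare_title); article is '' if there is none."""
    title = title.strip()
    match = _ARTICLE_RE.match(title)
    if not match:
        return "", title
    return match.group(1), match.group(2)

def to_library_title(title):
    """'The Scarlet Pimpernel' -> 'Scarlet Pimpernel, The'."""
    article, bare = split_article(title)
    return f"{bare}, {article}" if article else bare

print(to_library_title("The Scarlet Pimpernel"))
print(to_library_title("A Tale of Two Cities"))
print(split_article("Life of Pi"))
```

Requiring whitespace after the article keeps titles such as "Another Country" or "Theory of Everything" untouched.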
I believe the best approach is to use full-text search, with 'the' in the stopwords list. That would solve the search problem (i.e., 'the' on search phrases would be ignored).
However, if you are ordering the results by title, a title starting with 'The' would still be sorted, "in the 'T' section", as you put it. To solve that, there are several possible approaches. Here are some of them:
Separating the fields, the way you said in the question
Having a separate field with the number of chars to be ignored from the beginning when sorting
Replacing initial 'The's for sorting
Among others...
If you are using MySQL, you could use a string-replacement function such as REPLACE() to remove "The" from your query, or if you are using PHP (str_replace), Ruby or another language you can just clean up the search string before sending it to the database server.
Create three columns in the database
1) TitlePrefix
2) Title
3) TitlePostfix
Code such that you have 4 methods like
searchTitleOnly(testToSearch) // search only title column
searchTitleWithPrefixAndPostfix(testToSearch)//concat all the three columns and search
searchTitlePrefix(testToSearch) // search title prefix only
searchTitlePostfix(testToSearch) // search title postfix only
Try looking into SQL string functions such as LTRIM and RTRIM, and apply them to a temporary column that holds an exact copy of the data. Modify that copy by dropping whichever words you please, then perform the search on the modified column and return the entire row as the result.

Need a PHP MySQL script to search for keywords in a database

I need to implement a search option for user comments that are stored in a MySQL database. I would optimally like it to work in a similar manner to a standard web page search engine, but I am trying to avoid the large scale solutions. I'd like to just get a feel for the queries that would give me decent results. Any suggestions? Thanks.
It's possible to create a full indexing solution with some straightforward steps. You could create a table that maps words to each post, then when you search for some words find all posts that match.
Here's a short algorithm:
When a comment is posted, convert the string to lowercase and split it into words (split on spaces, and optionally dashes/punctuation).
In a "words" table store each word with an ID, if it's not already in the table. (Here you might wish to ignore common words like 'the' or 'for'.)
In an "indexedwords" table map the IDs of the words you just inserted to the post ID of the comment (or article if that is what you want to return).
When searching, split the search term on words and find all posts that contain each of the words. (Again here you might want to ignore common words.)
Order the results by number of occurrences. If the results must contain all the words, you'd need to find the intersection of your different arrays of posts.
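The steps above can be sketched in a few lines of Python. This is an in-memory illustration only: a dictionary stands in for the "words" and "indexedwords" tables, and the class name `CommentIndex`, the stop-word list, and the sample comments are all assumptions for the example.

```python
import re
from collections import Counter, defaultdict

STOP_WORDS = {"the", "for", "a", "of", "to", "is"}  # assumed list

def split_words(text):
    """Lowercase, split on non-alphanumerics, and drop common words."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return [w for w in words if w not in STOP_WORDS]

class CommentIndex:
    def __init__(self):
        # word -> Counter mapping post_id to number of occurrences
        self.postings = defaultdict(Counter)

    def add(self, post_id, text):
        """Index a comment under each of its words."""
        for word in split_words(text):
            self.postings[word][post_id] += 1

    def search(self, query, require_all=False):
        """Return post IDs ordered by total occurrences of the query words."""
        words = split_words(query)
        scores = Counter()
        for word in words:
            if word in self.postings:
                scores.update(self.postings[word])
        if require_all:
            # Keep only posts that appear under every query word.
            keep = None
            for word in words:
                ids = set(self.postings.get(word, ()))
                keep = ids if keep is None else keep & ids
            keep = keep or set()
            scores = Counter({p: n for p, n in scores.items() if p in keep})
        return [p for p, _ in sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))]

index = CommentIndex()
index.add(1, "The red car is red")
index.add(2, "A red bike")
index.add(3, "Blue car")
print(index.search("red car"))
print(index.search("red car", require_all=True))
```

In a real deployment the same mapping would live in the two database tables described above, with the lookup done by a join on the word IDs.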
As an entry point, you can use MySQL LIKE queries.
For example if you have a table 'comments' with a column named 'comment', and you want to find all comments that contain the word 'red', use:
SELECT comment FROM comments WHERE comment LIKE '% red %';
Please note that LIKE searches over the full text can be slow, so if your database is very large or if you run this query a lot, you will want to find an optimized solution, such as Sphinx (http://sphinxsearch.com).
