I have a table called users which contains the columns firstname and lastname.
I am struggling with how to structure the WHERE clause to return results on a matched first name, last name, first name plus a space and the last name, or last name plus a comma and the first name. For instance, "John", "Doe", "John Doe", and "Doe, John". It should also work for partial matches (e.g. "John D"). I am thinking of something like the following (substitute ? with the search phrase).
SELECT * FROM users WHERE
firstname LIKE ?
OR CONCAT_WS(" ",firstname,lastname) LIKE ?
OR lastname LIKE ?
OR CONCAT_WS(", ",lastname,firstname) LIKE ?
There is a little flaw with this approach: searching on "John Doe" will also return "John Enders", "John Flaggens", and "John Goodmen", so I will need to add a condition that returns results only when both first and last name match, if both are given.
There is also a big flaw with this approach: the functions in my WHERE clause prevent the use of indexes and result in significantly reduced performance.
Can I effectively do this using just SQL without using functions in my WHERE clause, or should I use server code (i.e. PHP) to parse the given search phrase for spaces and commas and create a dynamic SQL query based on the results?
You ask if you can do this efficiently within MySQL. Given your schema design, and without using FULLTEXT search, the simple answer is no, you can't do this efficiently.
You could create some wacky query using LIKE / STRCMP and / or string functions. If you were to take this approach, it would be much better to write some application logic that builds the query inline rather than trying to write one query that can handle everything. Either way, it's not likely to be truly efficient.
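The application-side approach could be sketched roughly as follows. This is an illustration in Python rather than PHP, and the function name `build_name_search` and its parsing rules (a comma means "last, first", a space means "first last") are assumptions, not something taken from the question. Only the table and column names come from the question. The `?` placeholder style depends on the database driver in use.

```python
def build_name_search(phrase):
    """Return (sql, params) for a name search phrase.

    Assumed rules: "Last, First" if a comma is present, "First Last" if a
    space is present, otherwise one term matched against either column.
    Every part is matched as a prefix (no leading %), so ordinary indexes
    on firstname and lastname remain usable.
    """
    phrase = phrase.strip()
    if "," in phrase:
        last, first = [p.strip() for p in phrase.split(",", 1)]
    elif " " in phrase:
        first, last = [p.strip() for p in phrase.split(" ", 1)]
    else:
        # Single term: it may be either a first name or a last name.
        sql = "SELECT * FROM users WHERE firstname LIKE ? OR lastname LIKE ?"
        return sql, [phrase + "%", phrase + "%"]
    # Two parts given: both must match, which avoids the "John Enders" problem.
    sql = "SELECT * FROM users WHERE firstname LIKE ? AND lastname LIKE ?"
    return sql, [first + "%", last + "%"]
```

For "John D" this yields the AND query with parameters "John%" and "D%". The sketch does not escape % or _ typed by the user, and it does not handle multi-word first or last names.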
MySQL 5.6 has the ability to perform FULLTEXT searches within InnoDB. If you're using an earlier version you can only do this with MyISAM tables, and there are many reasons why MyISAM may not be a good idea.
Another approach is to look at a real search solution such as Lucene, Sphinx or Xapian.
Have you tried regular expressions? http://dev.mysql.com/doc/refman/5.1/en/regexp.html
Well, I currently want to build a search engine with SQL and PHP. I currently use the following query:
SELECT * FROM info WHERE name LIKE '%$q%' LIMIT 10
But I want to select the rows whose 'name' starts with $q, not the ones that contain $q.
Simply remove the first wildcard (%):
SELECT * FROM info WHERE name LIKE 'X%' LIMIT 10
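A small sketch of building the prefix pattern safely, shown in Python for illustration. The helper name `prefix_pattern` is hypothetical. It assumes the search text is passed as a bound parameter instead of being interpolated into the SQL string, and that backslash is the LIKE escape character (the MySQL default).

```python
def prefix_pattern(q, escape="\\"):
    """Build a LIKE pattern that matches values starting with q.

    % and _ typed by the user are escaped so they match literally
    instead of acting as wildcards. The escape character itself is
    escaped first so later replacements do not double-escape it.
    """
    for ch in (escape, "%", "_"):
        q = q.replace(ch, escape + ch)
    return q + "%"

# The pattern is bound as a parameter; only the trailing % is a wildcard.
sql = "SELECT * FROM info WHERE name LIKE ? LIMIT 10"
params = [prefix_pattern("jo")]
```

Because the pattern has no leading wildcard, an index on name can be used for the lookup.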
If you are wanting to search fields that have multiple names and you want to search for a name that might be in the middle of the field, then it might be better to use Full Text Search (FTS). For example, if the name field in the OP refers to a full name and could contain "John Doe" and you want to search for "Doe", then FTS is probably going to be better (more accurate and faster) than using a LIKE operator. With FTS, the individual words in text are indexed and so searches can be very fast, plus it allows for more complex searches such as the ability to find two words (or names) in a field that are not adjacent (to pick one example at random).
The specific database vendor is not mentioned in the OP, but if FTS is supported, then it might be a good choice for what you are attempting. Some FTS information for SQL Server and for MySQL is easily found with a search.
I am building a site with a requirement to include plural words but exclude singular words, as well as to include longer phrases but exclude the shorter phrases found within them.
For example:
a search for "Breads" should return results with 'breads' within it, but not 'bread' or 'read'.
a search for "Paperback book" should return results with 'paperback book' within it, but not 'paperback' or 'book'.
The query I have tried is:
SELECT * FROM table WHERE (field LIKE '%breads%') AND (field NOT LIKE '%bread%')
...which clearly returned no results, even though there are records with 'breads' and 'bread' in it.
I understand why this query is failing (I'm telling it to both include and exclude the same strings) but I cannot think of the correct logic to apply to the code to get it working.
Searching for %breads% would NEVER return bread or read, as the 's' is a required character for the match. So just eliminate the AND clause:
SELECT ... WHERE (field LIKE '%breads%')
SELECT ... WHERE (field LIKE '%paperback book%');
You should consider using FULL TEXT SEARCH.
This will solve your Bread/read issue.
I believe the use of wildcards here isn't useful. Let's say you are using '%read%': this would also return bread, breads, etc., which is why I recommended Full Text Search.
With MySQL you can use REGEXP instead of LIKE, which gives you better control over your query:
SELECT * FROM table WHERE field REGEXP '[[:<:]]read[[:>:]]'
The [[:<:]] and [[:>:]] markers enforce word boundaries around your search term, including at the very start or end of the field (in MySQL 8.0.4 and later, write '\\bread\\b' instead). That gives you much better control over your matching, with the downside of a performance hit.
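The word-boundary idea can be illustrated outside the database. The sketch below uses Python's `re` module, whose `\b` syntax is not the same as MySQL's older regex syntax, so it shows the matching logic only. The helper name `contains_word` is hypothetical.

```python
import re

def contains_word(text, word):
    """True if `word` occurs in `text` as a whole word (case-insensitive).

    re.escape keeps any regex metacharacters in the search term literal;
    \\b on each side requires a word boundary there.
    """
    pattern = r"\b" + re.escape(word) + r"\b"
    return re.search(pattern, text, re.IGNORECASE) is not None
```

With this check, "read" is found in "I read a book" but not in "fresh bread", and "breads" is not found in a field that contains only "bread".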
I have a table which contains contact information. This table has 1 column relevant to names. It looks like this:
Sort Name:
Doe, John
Clinton, Bill
Dooby Doo, Scooby
Sadly, there are no first name / last name columns, and they can't be added.
If a user enters the search term "john doe", is there any way I can get MySQL to return the row for "Doe, John" as my result?
Edit: As a follow up to the answer by RHSeeger, what would I do to make
john doe
into
$name = array(0 => 'john', 1 => 'doe')
using PHP?
Thanks for the help so far
My thought would be:
SELECT * FROM tablename
WHERE LOWER(sort_name) LIKE '%name1%'
...
AND LOWER(sort_name) LIKE '%nameN%'
where <name1> ... <nameN> are the individual, lowercased "words" (split on space, comma?) in the name the user requested.
Edit: It's worth noting that, for a reasonably large data set, searches like this are going to be slow. If you're going to have lots of data, you could consider a search engine (like Solr) for the searching itself, using MySQL just for lookup by ID.
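A rough sketch of the word-splitting approach, written in Python for illustration (in PHP, `preg_split` with the same pattern would produce the equivalent array). The function name `build_sort_name_search` is hypothetical. `tablename` and `sort_name` follow the query above, and `?` stands for whatever placeholder style the driver uses.

```python
import re

def build_sort_name_search(term):
    """Return (sql, params) with one LIKE clause per word in the term.

    Words are split on whitespace and commas and lowercased, so that
    "john doe" and "Doe, John" produce the same set of clauses.
    """
    words = [w.lower() for w in re.split(r"[\s,]+", term) if w]
    if not words:
        return None, []
    where = " AND ".join("LOWER(sort_name) LIKE ?" for _ in words)
    sql = "SELECT * FROM tablename WHERE " + where
    return sql, ["%" + w + "%" for w in words]
```

For "John Doe" this gives two clauses with parameters "%john%" and "%doe%", which match "Doe, John" regardless of word order.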
In your application, split the search term at the last space, change the order and add a comma in between.
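That transformation might look like the following sketch, in Python for illustration. The helper name `to_sort_name` is hypothetical.

```python
def to_sort_name(term):
    """Turn "first last" into "last, first" by splitting at the last space.

    A term without any space is returned unchanged.
    """
    first, sep, last = term.strip().rpartition(" ")
    if not sep:
        return term.strip()
    return last + ", " + first.strip()
```

The result can then be compared against the sort name column. Splitting at the last space assumes a one-word last name, so "scooby dooby doo" becomes "doo, scooby dooby" and would not match "Dooby Doo, Scooby".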
Look into the SQL LIKE operator. For example:
select * from TABLENAME
where sort_name like 'doe%'
You would need to do a little manipulation to the search terms on the application side and possibly supply other conditional clauses in your sql statement with other like clauses.
I would begin by parsing the search into separate words, "john" and "doe", and then write a search using LIKE for both terms.
EDIT:
To answer your question from an answer below: how do you parse "John Doe" into "John" and "Doe"?
I would write a MySQL stored procedure to separate them into a result set containing "John" and "Doe" and then use LIKE against that column name.
Have a full-text index on the names column and then do this:
select * from names where match(name) against ('john doe');
I have a table that lists people and all their contact info. I want users to be able to perform an intelligent search on the table by simply typing in some terms and getting back results where each term they entered matches at least one of the columns in the table. To start, I have made a query like
SELECT * FROM contacts WHERE
firstname LIKE '%Bob%'
OR lastname LIKE '%Bob%'
OR phone LIKE '%Bob%' OR
...
But now I realize that this will completely fail on something as simple as 'Bob Jenkins', because it is not smart enough to search for the first and last name separately. What I need to do is split up the search terms, search for them individually, and then somehow intersect the results from each term. At least that seems like the solution to me. But what is the best way to go about it?
I have heard about fulltext and MATCH()...AGAINST() but that sounds like a rather fuzzy search and I don't know how much work it is to set up. I would like precise yes or no results with reasonable performance. The search needs to be done on about 20 columns by 120,000 rows. Hopefully users wouldn't type in more than two or three terms.
Oh sorry, I forgot to mention I am using MySQL (and PHP).
I just figured out fulltext search and it is a cool option to consider (is there a way to adjust how strict it is? LIMIT would just chop off the results regardless of how well they matched). But this requires a fulltext index, and my website is using a view, and you can't index a view, right? So...
I would suggest using MATCH / AGAINST. Full-text searches are more advanced searches, more like Google's, less elementary.
It can match across multiple tables and rank them to how many matches they have.
Otherwise, if the word is there at all, esp. across multiple tables, you have no ranking. You can do ranking server-side, but that is going to take more programming/time.
Depending on what database you're using, the ability to do cross columns can become more or less difficult. You probably don't want to do 20 JOINs as that will be a very slow query.
There are also engines such as Sphinx and Lucene dedicated to doing these types of searches.
BOOLEAN MODE
SELECT * FROM contacts WHERE
MATCH(firstname,lastname,email,webpage,country,city,street...)
AGAINST('+bob +jenkins' IN BOOLEAN MODE)
Boolean mode is very powerful. It might even fulfil all my needs. I will have to do some testing. Placing + in front of a search term makes that term required (the row must match 'bob' AND 'jenkins' instead of 'bob' OR 'jenkins'). On MyISAM tables this mode even works on columns without a FULLTEXT index, and thus I can use it on a view, although it will be slower (that is what I need to test). One final problem I had was that it wasn't matching partial search terms, so 'bob' wouldn't find 'bobby', for example. The usual % wildcard doesn't work; instead you append an asterisk to the term, as in 'bob*'.
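Turning raw user input into a boolean-mode search string could be sketched like this, in Python for illustration. The helper name `boolean_query` is hypothetical. It assumes every word should be required and, optionally, treated as a prefix. The resulting string would be bound as the AGAINST(... IN BOOLEAN MODE) argument.

```python
import re

def boolean_query(search, prefix=True):
    """Build a boolean-mode search string that requires every word.

    Only runs of word characters are kept, so operators typed by the
    user (+, -, *, quotes, parentheses) cannot change the query logic.
    With prefix=True each word gets a trailing *, so 'bob' also
    matches 'bobby'.
    """
    words = re.findall(r"\w+", search)
    suffix = "*" if prefix else ""
    return " ".join("+" + w + suffix for w in words)
```

For the input "bob jenkins" this produces "+bob* +jenkins*". Depending on the server's minimum word length and stopword settings, very short or very common words may still be ignored by the full-text engine.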
Say if I had a table of books in a MySQL database and I wanted to search the 'title' field for keywords (input by the user in a search field); what's the best way of doing this in PHP? Is the MySQL LIKE command the most efficient way to search?
Yes, the most efficient way usually is searching in the database. To do that you have three alternatives:
LIKE (or ILIKE, in databases such as PostgreSQL that provide it) to match exact substrings
RLIKE to match POSIX regexes
FULLTEXT indexes, which offer three further kinds of search aimed at natural-language text
So the best choice depends on what you will actually be searching for. For book titles I'd offer a LIKE search for exact substring matches, useful when people know the book they're looking for, and also a FULLTEXT search to help find titles similar to a word or phrase. I'd give them different names in the interface, of course, probably something like "exact" for the substring search and "similar" for the fulltext search.
An example about fulltext: http://www.onlamp.com/pub/a/onlamp/2003/06/26/fulltext.html
Here's a simple way you can break apart some keywords to build some clauses for filtering a column on those keywords, either ANDed or ORed together.
$terms = explode(',', isset($_GET['keywords']) ? $_GET['keywords'] : '');
$clauses = array();
$params = array();
foreach ($terms as $term)
{
    //remove any chars you don't want to be searching - adjust to suit
    //your requirements
    $clean = trim(preg_replace('/[^a-z0-9]/i', '', $term));
    //compare against '' rather than using empty(), so that a search
    //for "0" is not silently dropped
    if ($clean !== '')
    {
        //bind the value as a parameter rather than escaping it by hand:
        //mysql_escape_string() was removed in PHP 7, and placeholders
        //stay safe even if you later loosen the filter above
        $clauses[] = "title like ?";
        $params[] = '%' . $clean . '%';
    }
}
if (!empty($clauses))
{
    //concatenate the clauses together with AND or OR, depending on
    //your requirements
    $filter = '(' . implode(' AND ', $clauses) . ')';
    //build the required SQL, then execute it as a prepared statement,
    //passing $params as the bound values
    $sql = "select * from foo where $filter";
}
else
{
    //no search term, do something else, find everything?
}
Consider using Sphinx. It's an open source full-text engine that can consume your MySQL database directly. It's far more scalable and flexible than hand-coding LIKE statements (and far less susceptible to SQL injection).
You may also check the soundex functions (SOUNDEX, SOUNDS LIKE) in the MySQL manual: http://dev.mysql.com/doc/refman/5.0/en/string-functions.html#function_soundex
It's useful to return these matches if, for example, strict checking (by LIKE or =) did not return any results.
Paul Dixon's code example gets the main idea across well for the LIKE-based approach.
I'll just add this usability idea: provide an (AND | OR) radio button set in the interface, defaulting to AND. Then, if a user's query results in zero (0) matches and contains at least two words, respond with an option to this effect:
"Sorry, no matches were found for your search phrase. Expand search to match on ANY word in your phrase?"
Maybe there's a better way to word this, but the basic idea is to guide the person toward another query (that may be successful) without the user having to think in terms of the Boolean logic of AND and ORs.
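The decision logic behind that prompt is small enough to sketch, here in Python for illustration. `search_with_fallback` and its `run_query` callback are hypothetical names. The callback is assumed to run the search with the clauses joined by the given operator and return a list of rows.

```python
def search_with_fallback(words, run_query):
    """Search with AND first and report whether to offer an OR retry.

    run_query(words, joiner) is a hypothetical callback that executes
    the search with clauses joined by "AND" or "OR" and returns rows.
    The OR option is only worth offering when the AND search found
    nothing and there are at least two words to relax.
    """
    rows = run_query(words, "AND")
    offer_or = len(rows) == 0 and len(words) >= 2
    return rows, offer_or
```

When the second value is true, the page would show the "expand search" prompt, and accepting it would call `run_query(words, "OR")`.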
I think LIKE is the most efficient way if the search is a single word. A multi-word search can be split with the explode function, as already mentioned, and each word searched for individually in a loop. To avoid showing the same result twice, read the returned values into an array and ignore any value that is already in it. The count function then tells you where to stop when printing the results in a loop. Sorting can be done with the similar_text function, using the percentage it reports to order the array. That's the best approach, in my view.
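The de-duplicate-then-rank step could be sketched as follows, in Python for illustration. The function name `rank_titles` is hypothetical, and `difflib.SequenceMatcher` is used as a stand-in for PHP's similar_text. It is a different similarity algorithm, so the scores will not be identical.

```python
from difflib import SequenceMatcher

def rank_titles(query, titles):
    """Drop duplicate titles, then sort by similarity to the query."""
    seen = set()
    unique = []
    for title in titles:
        # Keep only the first occurrence of each title.
        if title not in seen:
            seen.add(title)
            unique.append(title)

    def score(title):
        # Ratio in [0, 1]; higher means more similar to the query.
        return SequenceMatcher(None, query.lower(), title.lower()).ratio()

    return sorted(unique, key=score, reverse=True)
```

Using a set for the "already seen" check avoids rescanning the result array for every row, which matters once the per-word queries return many hits.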