I want to build a search engine with SQL and PHP. I currently use the following query:
SELECT * FROM info WHERE name LIKE '%$q%' LIMIT 10
But I want to select only the rows whose 'name' starts with $q, not the ones that merely contain $q.
Simply remove the first wildcard (%):
SELECT * FROM info WHERE name LIKE 'X%' LIMIT 10
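A minimal sketch of the same prefix search with the user input bound as a parameter rather than interpolated into the SQL string. It is written in Python against an in-memory SQLite table purely so it can be run as-is; the sample rows and the helper names `like_prefix` and `starts_with` are made up for illustration. The same idea should carry over to PHP/MySQL prepared statements, though the escape character would need to be written as '\\' inside a MySQL string literal.

```python
import sqlite3

def like_prefix(term, escape="\\"):
    """Escape LIKE wildcards in user input, then append one trailing %."""
    # Escape the escape character first, then the two LIKE wildcards.
    for ch in (escape, "%", "_"):
        term = term.replace(ch, escape + ch)
    return term + "%"

# Hypothetical sample data standing in for the 'info' table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE info (name TEXT)")
conn.executemany(
    "INSERT INTO info VALUES (?)",
    [("apple",), ("pineapple",), ("apple_pie",), ("applesauce",)],
)

def starts_with(q):
    """Return up to 10 names that begin with q."""
    rows = conn.execute(
        "SELECT name FROM info WHERE name LIKE ? ESCAPE '\\' "
        "ORDER BY name LIMIT 10",
        (like_prefix(q),),
    )
    return [r[0] for r in rows]

print(starts_with("apple"))   # names beginning with "apple"
print(starts_with("apple_"))  # the underscore is matched literally
```

Escaping matters because a user who types % or _ would otherwise turn the prefix search back into a wildcard search.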
If you are wanting to search fields that have multiple names and you want to search for a name that might be in the middle of the field, then it might be better to use Full Text Search (FTS). For example, if the name field in the OP refers to a full name and could contain "John Doe" and you want to search for "Doe", then FTS is probably going to be better (more accurate and faster) than using a LIKE operator. With FTS, the individual words in text are indexed and so searches can be very fast, plus it allows for more complex searches such as the ability to find two words (or names) in a field that are not adjacent (to pick one example at random).
The specific database vendor is not mentioned in the OP, but if FTS is supported, then it might be a good choice for what you are attempting. Some FTS information for SQL Server and for MySQL is easily found with a search.
When I enter a word (e.g. freshers), I want both fresher and freshers records, but currently I am getting only freshers records. My code is like this:
$search='freshers';
$qry=mysql_query("select count(*) from jobs where job_title like '%$search%' or MATCH(job_title)
AGAINST('$search' IN BOOLEAN MODE)");
When the search word is freshers I get a count of 1200. When the search word is fresher I get a count of 2000.
How can I get almost the same count whether I enter freshers or fresher?
You cannot get precisely the same matching-record count from any MySQL technology with a search term that's either singular or plural.
MySQL doesn't have the smarts to know that freshers is the plural of fresher or children is the plural of child. If you want to do this with MySQL you'll have to start your search with the singular form of the word for which you want the plural.
Neither does MySQL know that mice is the plural of mouse.
If you need automated plural/singular functionality you may want to investigate Lucene or some other natural language search tech. The name of the capability you seek is "stemming."
But you can use FULLTEXT search with terms with trailing asterisks. For example, 'fresher*' matches fresher, freshers, and even fresherola. This will extend a search from singular to plural. It will not work the other way around.
select count(*)
from jobs
where MATCH(job_title) AGAINST('fresher*' IN BOOLEAN MODE)
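One way to make both inputs hit the same query is to normalise the word in application code before it reaches MySQL. The sketch below is Python rather than PHP, and `to_prefix_term` is a hypothetical helper. It uses a deliberately naive rule (strip one trailing "s"), which is not real stemming and will get irregular plurals and words like "bus" wrong. The `%s` placeholder is the style used by common Python MySQL drivers and is only shown as an assumption; no database call is made here.

```python
import re

def to_prefix_term(word):
    """Turn a user word into a boolean-mode prefix term such as 'fresher*'."""
    # Keep letters and digits only, so boolean-mode operators in the
    # input (+, -, *, quotes, ...) cannot change the meaning of the query.
    cleaned = re.sub(r"[^0-9A-Za-z]", "", word).lower()
    # Naive assumption: a single trailing 's' marks a regular plural.
    if len(cleaned) > 3 and cleaned.endswith("s") and not cleaned.endswith("ss"):
        cleaned = cleaned[:-1]
    return cleaned + "*" if cleaned else ""

# The query text stays constant; only the bound parameter changes.
SQL = ("SELECT COUNT(*) FROM jobs "
       "WHERE MATCH(job_title) AGAINST(%s IN BOOLEAN MODE)")

for user_input in ("freshers", "fresher"):
    params = (to_prefix_term(user_input),)
    print(user_input, "->", params[0])
```

Because both inputs end up as the same term, they would run the same query and so return the same count.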
There are some other modifying characters for boolean mode search terms. They are mentioned here:
http://dev.mysql.com/doc/refman/5.5/en/fulltext-boolean.html
Pro tip: column LIKE '%searchterm%' is probably the slowest way MySQL offers to search a column. It is guaranteed to scan the whole table.
Pro tip: FULLTEXT search is inherently a bit fuzzy. Expecting crisp record counts from it is probably a path to confusion.
Try this:
$search='fresher';
$qry=mysql_query("select count(*) from jobs where job_title like '$search%'");
I have a table called users which contains the columns firstname and lastname.
I am struggling with how to structure the WHERE clause to return results on matched first name, last name, first name plus a space and the last name, and last name plus a comma and the first name. For instance, "John", "Doe", "John Doe", and "Doe, John". It should also work for partial matches (e.g. "John D"). I am thinking of something like the following (substitute ? with the search phrase).
SELECT * FROM users WHERE
firstname LIKE ?
OR CONCAT_WS(" ",firstname,lastname) LIKE ?
OR lastname LIKE ?
OR CONCAT_WS(", ",lastname,firstname) LIKE ?
There is a little flaw with this approach as searching on "John Doe" will also return "John Enders", "John Flaggens", and "John Goodmen", so I will need to add some conditional to just return results when both first and last name match when both are given.
There is also a big flaw with this approach: the functions in my WHERE clause prevent the use of indexes and result in significantly reduced performance.
Can I effectively do this using just SQL without using functions in my WHERE clause, or should I use server code (i.e. PHP) to parse the given search phrase for spaces and commas and create a dynamic SQL query based on the results?
You ask whether you can do this efficiently within MySQL, given your schema design. Without using FULLTEXT search, the simple answer is no, you can't do this efficiently.
You could create some wacky query using LIKE / STRCMP and/or string functions. If you take this approach, it will be much better to write some application logic to create the query inline rather than trying to write one query that can handle everything. Either way, it's not likely to be truly efficient.
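As a rough illustration of the "application logic" route, the sketch below parses the phrase and emits a query that uses only plain column LIKE 'prefix%' conditions, which an index on each column can serve. It is Python with SQLite standing in for PHP and MySQL; `build_name_query` and the sample rows are hypothetical, and LIKE wildcards typed by the user are not escaped here.

```python
import sqlite3

def build_name_query(phrase):
    """Return (sql, params) for 'John', 'Doe', 'John D' or 'Doe, John'."""
    phrase = phrase.strip()
    select = "SELECT firstname, lastname FROM users WHERE "
    if "," in phrase:
        # "Doe, John": last name comes first
        last, first = [p.strip() for p in phrase.split(",", 1)]
    elif " " in phrase:
        # "John Doe" or "John D": first name comes first
        first, last = phrase.split(None, 1)
    else:
        # A single word could be either name
        return (select + "firstname LIKE ? OR lastname LIKE ?",
                (phrase + "%", phrase + "%"))
    # Both parts were given, so both must match
    return (select + "firstname LIKE ? AND lastname LIKE ?",
            (first + "%", last + "%"))

# Hypothetical sample data
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (firstname TEXT, lastname TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("John", "Doe"), ("John", "Enders"), ("Jane", "Doe"), ("Doe", "Johnson")],
)

def search(phrase):
    sql, params = build_name_query(phrase)
    return sorted(conn.execute(sql, params).fetchall())

print(search("John D"))
print(search("Doe, John"))
```

Requiring both parts with AND is what stops "John Doe" from also returning "John Enders".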
MySQL 5.6 has the ability to perform FULLTEXT searches within InnoDB. If you're using an earlier version you can only do this with MyISAM tables, and there are many reasons why MyISAM may not be a good idea.
Another approach is to look at a real search solution such as Lucene, Sphinx or Xapian.
Have you tried regular expressions? http://dev.mysql.com/doc/refman/5.1/en/regexp.html
I have a MySQL table storing some user generated content. For each piece of content, I have a title (VARCHAR 255) and a description (TEXT) column.
When a user is viewing a record, I want to find other records that are 'similar' to it, based on the title/description being similar.
What's the best way to go about doing this? I'm using PHP and MySQL.
My initial ideas are:
1) Either to strip out common words from the title and description to be left with 'unique' keywords, and then find other records which share those keywords.
E.g in the sentence: "Bob woke up at 5 am and went to school", the keywords would be: "Bob, woke, 5, went, school". Then if there's another record whose title talks about 'bob' and 'school', they would be considered 'similar'.
2) Or to use MySQL's full text search, though I don't know if this would be any good for something like this?
Which method would be better out of the two, or is there another method which is even better?
I'll keep this short (it could be way too long)...
I would not select the keywords 'manually' or modify your original data.
MySQL supports full-text search with the MyISAM engine (InnoDB only gained it in MySQL 5.6). A full description of the options available when querying the DB is available here. The query can automatically get rid of common stop-words and words too common in the data set (more than 50% of the rows contain them), depending on the querying method. Query expansion is also available, and the query type should be decided depending on your needs.
Consider also using a separate engine like Lucene. With Lucene you will probably have more functionality and better indexing/searching. You can automatically get rid of common words (they get a low score and do not influence the search) and use things such as stemming. There is a bit of a learning curve, but I'd definitely look into it.
EDIT:
The MySQL 'full-text natural language search' returns the most similar rows (and their relevance score) and is not a boolean matching search.
You would start by defining what similar means to you and how you want to score the similarity between two different documents.
Using that algorithm you can process all your documents and build a table of similarity scores.
Depending on the complexity of your scoring algorithm and size of data set, this may not be something you would run realtime, but instead batch it through something like Hadoop.
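To make "define similarity, then build a table of scores" concrete, here is one possible definition, sketched in Python: strip stop words to get keyword sets and score each pair by Jaccard overlap (shared keywords divided by all distinct keywords). The stop-word list, the function names and the sample documents are all invented for the example; a real scorer would likely weight rare words more heavily.

```python
import re
from itertools import combinations

# Hypothetical, deliberately tiny stop-word list
STOP_WORDS = {"a", "an", "and", "at", "am", "in", "is", "of", "the", "to", "up"}

def keywords(text):
    """Lowercase, split into word tokens, and drop stop words."""
    return {w for w in re.findall(r"[a-z0-9]+", text.lower())
            if w not in STOP_WORDS}

def jaccard(a, b):
    """Shared keywords divided by all distinct keywords (0.0 to 1.0)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

def similarity_table(docs):
    """Return {(id1, id2): score} for every pair of documents."""
    kw = {doc_id: keywords(text) for doc_id, text in docs.items()}
    return {(i, j): jaccard(kw[i], kw[j])
            for i, j in combinations(sorted(kw), 2)}

docs = {
    1: "Bob woke up at 5 am and went to school",
    2: "Bob went to school",
    3: "Cats sleep",
}
for pair, score in similarity_table(docs).items():
    print(pair, round(score, 2))
```

The pairwise loop grows quadratically with the number of documents, which is why this kind of job tends to be run as a batch rather than per page view.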
I have done something like this. I replace all of the spaces in the string with % then use LIKE in the where clause. Here, I will give you my code. It is from MSSQL but minor adjustments can be made to work it with MySQL. Hope it helps.
CREATE FUNCTION [dbo].[fss_MakeTextSearchable] (@text NVARCHAR(MAX)) RETURNS NVARCHAR(MAX)
--replaces spaces with wildcard characters to return more matches in a LIKE condition
-- for example:
-- @text = 'my file' will return '%my%file%'
-- SELECT WHERE 'my project files' like @text would return true
AS
BEGIN
DECLARE @searchableText NVARCHAR(MAX)
SELECT @searchableText = '%' + replace(@text, ' ', '%') + '%'
RETURN @searchableText
END
Then use the function like this:
SELECT @searchString = dbo.fss_MakeTextSearchable(@String)
Then in your query:
Select * from Table where title LIKE @searchString
I am building a search feature for the messages part of my site, and have a messages database with a little over 9,000,000 rows and an index on the sender, subject, and message fields. I was hoping to use the MySQL LIKE clause in my query, for example:
SELECT sender, subject, message FROM Messages WHERE message LIKE '%EXAMPLE_QUERY%';
to retrieve results. Unfortunately, MySQL doesn't use indexes when a leading wildcard is present, and the leading wildcard is necessary because the search term could appear anywhere in the message (this is how the wildcards work, no?). Queries are very, very slow, and I cannot use a full text index either, because of the annoying 50% rule (I just can't afford to rule that much out). Is there any way (or even any alternative) to optimize a query using LIKE and two wildcards? Any help is appreciated.
You should either use full-text indexes (you said you can't), design a full-text search by yourself or offload the search from MySQL and use Sphinx/Lucene. For Lucene you can use Zend_Search_Lucene implementation from Zend Framework or use Solr.
Normal indexes in MySQL are B+Trees, and they can't be used if the start of the string is not known (which is the case when you have a wildcard at the beginning).
Another option is to implement search on your own, using a reference table. Split the text into words and create a table that contains word, record_id. Then in the search you split the query into words and search for each of the words in the reference table. In this way you are not limiting yourself to the beginning of the whole text, but only to the beginning of the given word (and you'll match the rest of the words anyway).
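A small sketch of that reference table, in Python with SQLite standing in for MySQL so it can be run directly. The table name `message_words` and the helper functions are hypothetical. Each lookup is word LIKE 'term%' with no leading wildcard, which is the form a B-tree index on the word column can serve in MySQL.

```python
import re
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE messages (id INTEGER PRIMARY KEY, message TEXT)")
# One row per distinct (word, record) pair; the key doubles as the index.
conn.execute(
    "CREATE TABLE message_words ("
    "word TEXT, record_id INTEGER, PRIMARY KEY (word, record_id))"
)

def add_message(record_id, text):
    """Store a message and index each distinct word it contains."""
    conn.execute("INSERT INTO messages VALUES (?, ?)", (record_id, text))
    words = set(re.findall(r"[a-z0-9]+", text.lower()))
    conn.executemany(
        "INSERT INTO message_words VALUES (?, ?)",
        [(w, record_id) for w in words],
    )

def search(query):
    """Ids of messages that contain, for every query word, a word starting with it."""
    result = None
    # Terms are alphanumeric only, so they cannot contain LIKE wildcards.
    for term in re.findall(r"[a-z0-9]+", query.lower()):
        rows = conn.execute(
            "SELECT record_id FROM message_words WHERE word LIKE ?",
            (term + "%",),
        )
        ids = {r[0] for r in rows}
        result = ids if result is None else result & ids
    return sorted(result or [])

add_message(1, "This is an example query")
add_message(2, "Another sample")
add_message(3, "Examples of queries")
print(search("exam quer"))
```

As the answer says, this still cannot find "xampl" inside "example"; it only moves the prefix restriction from the whole text down to each word.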
'%EXAMPLE_QUERY%' is a very, very bad idea. I am going to give you some suggestions:
A. Avoid wildcards at the start of LIKE queries; use 'EXAMPLE_QUERY%' instead.
B. Create Keywords where you can easily use MATCH
If you want to stick with using MySQL, you should use FULL TEXT indexes. Full text indexes index words in a text block. You can then search on word stems and return the results in order of relevance. So you can find the word "example" within a block of text, but you still can't search efficiently on "xampl" to find "example".
MySQL's full text search is not great, but it is functional.
http://dev.mysql.com/doc/refman/5.1/en/fulltext-search.html
select * from emp where ename like '%e';
returns the rows whose ename ends with the letter e.
select * from emp where ename like 'A%';
returns the rows whose ename begins with the letter A.
select * from emp where ename like '_a%';
returns the rows whose ename has a as its second letter.
I need to implement a search option for user comments that are stored in a MySQL database. I would optimally like it to work in a similar manner to a standard web page search engine, but I am trying to avoid the large scale solutions. I'd like to just get a feel for the queries that would give me decent results. Any suggestions? Thanks.
It's possible to create a full indexing solution with some straightforward steps. You could create a table that maps words to each post, then when you search for some words find all posts that match.
Here's a short algorithm:
1) When a comment is posted, convert the string to lowercase and split it into words (split on spaces, and optionally dashes/punctuation).
2) In a "words" table store each word with an ID, if it's not already in the table. (Here you might wish to ignore common words like 'the' or 'for'.)
3) In an "indexedwords" table map the IDs of the words you just inserted to the post ID of the comment (or article if that is what you want to return).
4) When searching, split the search term into words and find all posts that contain each of the words. (Again, here you might want to ignore common words.)
5) Order the results by number of occurrences. If the results must contain all the words, you'd need to find the intersection of your different arrays of posts.
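The steps above could look roughly like this. It is a sketch in Python with SQLite, not a drop-in for PHP/MySQL: the `occurrences` column, the stop-word list and the function names are assumptions added for the example, and `INSERT OR IGNORE` is SQLite syntax (MySQL spells it `INSERT IGNORE`). The "must contain all words" case is handled in SQL with GROUP BY and HAVING instead of intersecting arrays in application code.

```python
import re
import sqlite3

STOP_WORDS = {"a", "and", "for", "the"}  # hypothetical, tiny on purpose

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE words (id INTEGER PRIMARY KEY, word TEXT UNIQUE);
CREATE TABLE indexedwords (
    word_id INTEGER, post_id INTEGER, occurrences INTEGER,
    PRIMARY KEY (word_id, post_id));
""")

def tokenize(text):
    """Lowercase, split into words, drop common words."""
    return [w for w in re.findall(r"[a-z0-9]+", text.lower())
            if w not in STOP_WORDS]

def index_comment(post_id, text):
    """Store each distinct word once, then map it to the post."""
    tokens = tokenize(text)
    for word in set(tokens):
        conn.execute("INSERT OR IGNORE INTO words (word) VALUES (?)", (word,))
        word_id = conn.execute(
            "SELECT id FROM words WHERE word = ?", (word,)).fetchone()[0]
        conn.execute("INSERT INTO indexedwords VALUES (?, ?, ?)",
                     (word_id, post_id, tokens.count(word)))

def search(query, require_all=True):
    """Post ids ordered by total occurrences of the search words."""
    terms = sorted(set(tokenize(query)))
    if not terms:
        return []
    placeholders = ",".join("?" * len(terms))
    sql = ("SELECT iw.post_id FROM indexedwords iw "
           "JOIN words w ON w.id = iw.word_id "
           f"WHERE w.word IN ({placeholders}) "
           "GROUP BY iw.post_id HAVING COUNT(*) >= ? "
           "ORDER BY SUM(iw.occurrences) DESC, iw.post_id")
    needed = len(terms) if require_all else 1
    return [r[0] for r in conn.execute(sql, (*terms, needed))]

index_comment(1, "The red car and the red bike")
index_comment(2, "A red apple")
index_comment(3, "Blue car")
print(search("red car"))
print(search("red car", require_all=False))
```

Since each (word, post) pair is stored once, COUNT(*) per post equals the number of distinct search words that post contains, so comparing it with the number of terms gives the "all words" filter.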
As an entry point, you can use MySQL LIKE queries.
For example if you have a table 'comments' with a column named 'comment', and you want to find all comments that contain the word 'red', use:
SELECT comment FROM comments WHERE comment LIKE '% red %';
Please note that fulltext searches can be slow, so if your database is very large or if you run this query a lot, you will want to find an optimized solution, such as Sphinx (http://sphinxsearch.com).