I have a search text field which searches a particular column in a table, but the results are not as I expected.
When the user searches for something like "hello world how", no result is found, because the query is LIKE '%hello world how%' while the table row contains the string "hello world".
How do I do a proper search using PHP and MySQL, and what if I need to search multiple tables or all columns in a table? Which is the best way to do it?
One bad way to do this would be to split the user's search text on whitespace. "hello world how" would become "hello%world%how". However, this still would require the word "how" to be in there, after "hello world", and would not guarantee that "hello" and "world" are near each other.
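To make the limitation concrete, here is a minimal sketch of that wildcard approach. It assumes an in-memory SQLite database as a stand-in for MySQL (the `%` wildcard behaves the same way for this purpose), and the `notes` table, its rows, and the helper names are made up for illustration.

```python
import sqlite3


def to_like_pattern(search_text: str) -> str:
    """Join the words with % wildcards, e.g. 'hello world how' -> '%hello%world%how%'.

    Note: % and _ inside the user's text are not escaped in this sketch.
    """
    return "%" + "%".join(search_text.split()) + "%"


def search(conn: sqlite3.Connection, search_text: str) -> list:
    """Return the bodies of all rows matching the wildcard pattern."""
    rows = conn.execute(
        "SELECT body FROM notes WHERE body LIKE ? ORDER BY id",
        (to_like_pattern(search_text),),
    ).fetchall()
    return [row[0] for row in rows]


# Hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE notes (id INTEGER PRIMARY KEY, body TEXT)")
conn.executemany(
    "INSERT INTO notes (body) VALUES (?)",
    [("hello world",), ("hello big wide world, how are you",)],
)

# The row "hello world" is missed because it lacks "how", while the second
# row matches even though "hello" and "world" are not adjacent.
print(search(conn, "hello world how"))
```

The row the user most likely wanted is the one that gets dropped, which is the problem described above.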
Further, even if you've put an index on the column being searched, a LIKE clause that starts with a wildcard character (%) cannot use that index in MySQL. This means that every search would be a full table scan. That can get pretty slow.
One solution for you might be MySQL fulltext search. It's pretty flexible. However, before MySQL 5.6 it only works with MyISAM tables. You probably want to use InnoDB for your data, because MyISAM does not support foreign keys or transactions. On those older versions, you can create separate, dedicated tables that use MyISAM and fulltext indexing, just for searching purposes.
Sphinx is another option available to you. It's a third party search system that can be attached to MySQL and PHP.
All of these answers, however, are focused around searching one column at a time. If you need to search entire rows, that becomes a little more interesting. You might want to consider a "document"-based search system, like ElasticSearch.
Related
I am working on a project where I have a database that contains a summary field, which visitors to the site fill in through a web form.
When the user finishes entering the summary, I want to use the words they entered to look up similar records in the database that contain the same keywords.
I was thinking I could split the summary string that is submitted and then loop through the array and build up a query so the query would end up something like:
SELECT *
FROM my_table
WHERE summary LIKE '%keyword1%'
OR summary LIKE '%keyword2%'
OR summary LIKE '%keyword3%';
However, this seems massively inefficient, and as the database could grow quite big, could potentially become quite a slow query to run.
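The split-and-loop idea described above could be sketched roughly as follows. This only builds the SQL text and its parameter list; the `?` placeholders, the `escape_like` helper, and the function name are assumptions for illustration, while `my_table` and `summary` come from the query above. It does nothing about the performance concern.

```python
def escape_like(word: str) -> str:
    """Escape LIKE wildcards so user input is matched literally.

    Backslash is MySQL's default LIKE escape character.
    """
    return word.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")


def build_keyword_query(summary: str):
    """Return (sql, params) matching rows that contain any keyword, or (None, [])."""
    # Lower-case only to de-duplicate; dict.fromkeys keeps the first-seen order.
    keywords = list(dict.fromkeys(summary.lower().split()))
    if not keywords:
        return None, []
    where = " OR ".join("summary LIKE ?" for _ in keywords)
    sql = f"SELECT * FROM my_table WHERE {where}"
    params = [f"%{escape_like(keyword)}%" for keyword in keywords]
    return sql, params


sql, params = build_keyword_query("disk full Disk")
print(sql)
print(params)
```

Passing the keywords as bound parameters rather than concatenating them into the SQL string keeps user input out of the query text.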
I then found the MySQL IN clause, but this only seems to work with multiple values where a field can only contain 1 of these values in a row.
Is there a way I can use the IN function, or is there a better MySQL function that I can use to do what I want, or is my first idea the only way round it?
An example of what I am trying to achieve is a bit like on Stack Overflow. When you lose focus of the title field, it pops up similar questions based on the title you've provided.
I would recommend reading the manual page on InnoDB FULLTEXT Indexes and the one on Full-Text Restrictions. Recent releases of MySQL have added new full-text functionality, extending it to InnoDB tables.
Concerning the inability to upgrade a MySQL version, there is no reason why one cannot mix and match MyISAM and InnoDB tables in the same database. As such, one would keep textual information in MyISAM (where FTS indexing was historically available) and do joins to InnoDB tables when needed. This avoids the "must upgrade to version 5.6" argument.
Legend: FTS=Full Text Search
I'm building a database of IT candidates for a friend who owns a recruitment company. He has thousands of candidates currently in an Excel spreadsheet and I'm converting it into a MySQL database.
Each candidate has a skill field with their skills listed as a string e.g. "javascript, php, nodejs..." etc.
My friend will have employees under him who will also search the database. For security reasons, however, we want to limit them to search results containing candidates with specific skills, depending on which vacancy they are working on (so they don't steal large sections of the database and go and set up their own recruitment company with the data).
So if an employee is working on a javascript role, they will be limited to search results where the candidate has the word "javascript" in their skills field. So if they searched for all candidates named "Michael" then it would only return "Michaels" with javascript skills for instance.
My concern is that the searches might take too long, since every search must scan the skills field, which can sometimes be a long string.
Is my concern justified? If so is there a way to optimize this?
If the number of records is in the thousands, you probably won't have any speed issues (just make sure you're not querying more often than you should).
You've tagged this question with a 'mysql' tag so I'm assuming that's the database you're using. Make sure you add a FULLTEXT index to speed up the search. Please note, however, that this type of index is only available for InnoDB tables starting with MySQL 5.6.
Try the built-in search first, but if you find it to be too slow, or not accurate enough in its results, you can look at external full-text search engines. I've personally had very good experience with the Sphinx search server, where it easily indexed millions of text records and returned good results.
Your queries will require a full table scan (unless you use a full text index). I highly recommend that you change the data structure in the database by introducing two more tables: Skills and CandidateSkills.
The first would be a list of available skills, containing rows such as:
SkillId  SkillName
1        javascript
2        php
3        nodejs
The second would say which skills each person has:
CandidateId  SkillId
1            1
2            1
2            2
This will speed up the searches, but that is not the primary reason. The primary reason is to fix problems and enable functionality such as:
Preventing spelling errors in the list of searches.
Providing a basis for enabling synonym searches.
Making sure thought goes into adding new skills (because they need to be added to the Skills table).
Allowing the database to scale.
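A minimal sketch of how the two-table structure described above could be queried. It uses an in-memory SQLite database as a stand-in for MySQL; the `Candidates` table, the names in it, the extra row for candidate 3, and the `search_candidates` helper are assumptions added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Candidates (
        CandidateId INTEGER PRIMARY KEY,
        Name        TEXT NOT NULL
    );
    CREATE TABLE Skills (
        SkillId   INTEGER PRIMARY KEY,
        SkillName TEXT NOT NULL UNIQUE
    );
    CREATE TABLE CandidateSkills (
        CandidateId INTEGER NOT NULL REFERENCES Candidates(CandidateId),
        SkillId     INTEGER NOT NULL REFERENCES Skills(SkillId),
        PRIMARY KEY (CandidateId, SkillId)
    );
    -- Hypothetical sample rows.
    INSERT INTO Candidates VALUES (1, 'Michael Stone'), (2, 'Anna Li'), (3, 'Michael Reed');
    INSERT INTO Skills VALUES (1, 'javascript'), (2, 'php'), (3, 'nodejs');
    INSERT INTO CandidateSkills VALUES (1, 1), (2, 1), (2, 2), (3, 2);
    """
)


def search_candidates(conn, name_prefix: str, required_skill: str) -> list:
    """Return names starting with name_prefix, restricted to one required skill."""
    rows = conn.execute(
        """
        SELECT c.Name
        FROM Candidates c
        JOIN CandidateSkills cs ON cs.CandidateId = c.CandidateId
        JOIN Skills s ON s.SkillId = cs.SkillId
        WHERE s.SkillName = ? AND c.Name LIKE ?
        ORDER BY c.CandidateId
        """,
        (required_skill, name_prefix + "%"),
    ).fetchall()
    return [row[0] for row in rows]


# An employee working on a javascript vacancy only sees Michaels with that skill.
print(search_candidates(conn, "Michael", "javascript"))
```

Because the skill is matched by equality on an indexed column instead of by scanning a free-text string, the restriction per vacancy becomes an ordinary join condition.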
If you attempt to do what you want using a full text index, you will learn a few things. For instance, the default minimum word length is 4 (for MyISAM; InnoDB defaults to 3), which would be a problem if your skills include "C" or "C++". MySQL doesn't support synonyms, so you'd have to muck around to get that functionality. And you might get unexpected results if you have skills that are multiple words.
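The effect of a minimum word length can be illustrated with a rough approximation. This is not MySQL's actual tokenizer: it simply treats letters, digits, and underscores as word characters and drops short tokens, and the `indexable_tokens` name is made up for this sketch.

```python
import re


def indexable_tokens(text: str, min_len: int = 4) -> list:
    """Split text into word tokens and keep those with at least min_len characters."""
    tokens = re.findall(r"[A-Za-z0-9_]+", text.lower())
    return [token for token in tokens if len(token) >= min_len]


skills = "javascript, php, nodejs, C, C++"

# With a minimum length of 4, "php", "C" and "C++" never reach the index.
print(indexable_tokens(skills))
# Lowering the minimum to 3 keeps "php", but "C" and "C++" are still dropped.
print(indexable_tokens(skills, min_len=3))
```

Note that "C++" is reduced to "c" before the length check, because "+" is not a word character in this approximation, so it could not be told apart from "C" even with a minimum length of 1.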
I just came over this site: http://www.hittaplagget.se. If you enter the following search word moo, the autosuggest pops up immediately.
But if you go to my site, http://storelocator.no, and use the same search phrase (in "Search for brand" field), it takes a lot longer for autosuggest to suggest anything.
I know that we can only guess at what type of technology they are using, but hopefully someone here can make a better educated guess than I can.
In my solution I only run a query along the lines of SELECT ... FROM table WHERE column LIKE 'moo%' and return the results.
I have not yet indexed my table as there are only 7000 rows in it. But I'm thinking of indexing my tables using Lucene.
Can anyone suggest what I need to do in order to get equally fast autosuggest?
You must add an index on the column holding your search terms, even at 7000 rows. Otherwise, the database searches through the whole list every time. See http://dev.mysql.com/doc/refman/5.0/en/create-index.html.
Lucene is a full text search index and may or may not be what you're looking for. Lucene would find any occurrence of "moo" in the entire indexed column (e.g. Mootastic and Fantasticmoo) and does not necessarily speed up your search although it's faster than a where x like '%moo%' type of search.
As others have already pointed out a regular index (probably even unique?) is what you want if you're performing "starts with" type of searches.
You will need to table-scan the table, so I suggest:
Don't put any rows in the table you don't need - for example, "inactive" records - keep them in a different table
Don't put any columns in the table you don't need
You can achieve this by having a special "Search table" which just contains the rows/columns you're interested in, and updating it from the "Master table".
Table-scanning a 7000 row table should be extremely efficient if the rows are small; I understand from your problem domain that this will be the case.
But as others have pointed out - don't send the 7000 rows to the client-side when it doesn't need it.
A conventional index can optimise a LIKE 'someprefix%' into a range-scan, so it is probably helpful having one. If you want to search for the string in any part of the entry, it is going to be a table-scan (which should not be slow on such a tiny table!)
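The reason a conventional index helps with 'someprefix%' can be sketched with a sorted list and a binary search. This is only an analogy for a B-tree range scan, not how MySQL implements it; the brand names (apart from the two examples mentioned earlier) and the `prefix_search` helper are made up.

```python
from bisect import bisect_left


def prefix_search(sorted_terms: list, prefix: str, limit: int = 10) -> list:
    """Return up to `limit` terms starting with `prefix` from a sorted list.

    Binary search finds the first candidate; matches are then contiguous,
    so the loop stops at the first term that no longer has the prefix.
    """
    matches = []
    i = bisect_left(sorted_terms, prefix)
    while (
        i < len(sorted_terms)
        and len(matches) < limit
        and sorted_terms[i].startswith(prefix)
    ):
        matches.append(sorted_terms[i])
        i += 1
    return matches


# Hypothetical brand list, lower-cased and sorted once up front (the "index").
brands = sorted(
    name.lower()
    for name in ["Mootastic", "Fantasticmoo", "Moonlight", "Monkey Wear", "Mango Tree"]
)

print(prefix_search(brands, "moo"))
```

"fantasticmoo" is not returned: a term containing "moo" anywhere else is not adjacent in sort order, which is why '%moo%' cannot use the same trick and falls back to a scan.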
I want to build a product-search engine.
I was thinking of using google-site-search but that really searches Google's index of your site. I do not want to search that. I want to search a specific table (all the fields, even ones the user never sees) on my data-base for given keywords.
But I want this search to be as robust as possible, so I was wondering if there is something already out there I could use. If not, what's the best way to go about making it myself?
You can try using Sphinx full-text search for MySQL.
Here's also a tutorial from IBM using PHP.
I'd focus on MySQL Full-Text search first. Take a look at these links:
http://dev.mysql.com/doc/refman/4.1/en/fulltext-search.html
http://dev.mysql.com/doc/refman/5.1/en/fulltext-boolean.html
Here is a snippet from the first link:
Full-text searching is performed using MATCH() ... AGAINST syntax. MATCH() takes a comma-separated list that names the columns to be searched. AGAINST takes a string to search for, and an optional modifier that indicates what type of search to perform. The search string must be a literal string, not a variable or a column name. There are three types of full-text searches:
As far as stuff that's already out there, take a look at these:
Search all tables (for SQL Server, but you could probably adapt it to MySQL)
Another search all tables (for SQL Server, but you could probably adapt it to MySQL)
Search all varchar columns in database
MySQL Full-Text Search
Using MySQL Full-Text Search
SELECT * FROM table WHERE value REGEXP 'searchterm'
Allows you to use many familiar search tricks such as +, "", etc.
This is a native function of MySQL. No need to go to a new language or plugin, which might be faster but also means extra time for maintenance, troubleshooting, etc.
It may be a little slower than doing some crazy C++ based mashup, but users don't generally notice a difference of a few milliseconds.
One thing you might also want to look into (if you're not going to utilize sphinx), is stemming your keywords. It will make matching keywords a bit easier (as stemming 'cheese' and 'cheesy' would end up producing the same stemmed word) which makes your keyword matching a bit more flexible.
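As a rough illustration of the idea, here is a toy suffix-stripping stemmer. It is deliberately naive and is not the algorithm used by Sphinx or by any real stemming library, which may group words differently; the suffix list and function names are assumptions for this sketch.

```python
def toy_stem(word: str) -> str:
    """Strip one common English suffix, keeping at least three characters."""
    word = word.lower()
    for suffix in ("ies", "es", "s", "y", "e"):  # longer suffixes are tried first
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def stems(text: str) -> set:
    """Return the set of stems for all whitespace-separated words in text."""
    return {toy_stem(word) for word in text.split()}


def shares_keyword(query: str, document: str) -> bool:
    """True if the query and the document have at least one stem in common."""
    return bool(stems(query) & stems(document))


print(toy_stem("cheese"), toy_stem("cheesy"), toy_stem("cheeses"))
print(shares_keyword("cheesy pizza", "extra cheese please"))
```

In practice the stems would be computed once when a row is stored and again for each incoming query, so that both sides are compared in stemmed form.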
I've got a mysql dataset that contains 86 million rows.
I need to have a relatively fast search through this data.
The data I'll be searching through is all strings.
I also need to do partial matches.
Now, if I have 'foobar' and search for '%oob%' I know it'll be really slow - it has to look at every row to see if there is a match.
What methods can be used to speed up queries like this?
I don't think fulltext search will allow for partial matches (I could be wrong on this).
You might take a look at Sphinx Search which is a full-text and partial match search engine system. You can easily import your mysql data into it, and then use simple PHP queries to search the data. It is far more efficient than using MySQL to do the query.
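To show why a dedicated index can answer '%oob%'-style queries without reading every row, here is a small in-memory trigram index. This is a swapped-in illustration of one general technique, not a description of how Sphinx or MySQL work internally, and the `TrigramIndex` class is hypothetical.

```python
from collections import defaultdict


def trigrams(text: str) -> set:
    """Return the set of all 3-character substrings of the lower-cased text."""
    text = text.lower()
    return {text[i:i + 3] for i in range(len(text) - 2)}


class TrigramIndex:
    def __init__(self):
        self.rows = []                    # row id -> original string
        self.postings = defaultdict(set)  # trigram -> ids of rows containing it

    def add(self, text: str) -> None:
        row_id = len(self.rows)
        self.rows.append(text)
        for gram in trigrams(text):
            self.postings[gram].add(row_id)

    def search(self, needle: str) -> list:
        """Return rows containing needle as a substring (case-insensitive)."""
        grams = trigrams(needle)
        if grams:
            # Only rows containing every trigram of the needle can match.
            candidates = set.intersection(
                *(self.postings.get(gram, set()) for gram in grams)
            )
        else:
            # Needles shorter than 3 characters fall back to a full scan.
            candidates = range(len(self.rows))
        needle = needle.lower()
        # Verify candidates, since sharing trigrams does not guarantee a match.
        return [self.rows[i] for i in sorted(candidates) if needle in self.rows[i].lower()]


index = TrigramIndex()
for value in ["foobar", "barfoo", "quux"]:
    index.add(value)

print(index.search("oob"))
```

The trade-off is index size: every string contributes one posting per distinct trigram, which matters at tens of millions of rows.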
What you seek is Full Text Search. This would enable you to do partial matches quickly and searches on multiple columns quickly.