MySQL: How to search for spelling variants? ("murrays", "murray's" etc) - php

I want to search like this: the user inputs e.g. "murrays", and the search result will show both records containing "murrays" and records containing "murray's". What should I do in my query.pl?

What do you think about using the SOUNDEX function and the SOUNDS LIKE operator?
That way, you can simply do:
SELECT * from USERS WHERE name SOUNDS LIKE 'murrays'
I'm pretty sure it doesn't work for every case, and perhaps it is not the most efficient way to solve the problem, but it could fit your needs.

This won't help if you absolutely need to do these queries in SQL, but if you can set up a Lucene search index for it, you gain a lot of this kind of "fuzzy search" functionality. Note though that Lucene is quite a complex topic by itself.

What you could do is create an extra field in the database, which contains the data with all special characters stripped from it, and search there. A bit lame, I know. Looking forward to seeing smarter answers ;)

Quick and dirty:
SELECT * FROM myTable WHERE REPLACE(name, '\'', '') = 'murrays'

I would first build a search column which has the text without punctuation and then search on that. Otherwise you'll have to have a series of regular expressions to search against, or check individual records in PHP for matching: both of which are computationally intensive operations.
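A minimal sketch of the extra-column idea, assuming a hypothetical helper `normalizeName()` and a hypothetical column `name_search` (neither is from the answers above). The same function would be applied to the name when the row is written and to the user's input at query time:

```php
<?php
// Hypothetical helper: lowercases the text and strips everything that is
// not an ASCII letter, digit, or space, so "Murray's" and "murrays"
// produce the same key. Non-ASCII letters are stripped too in this sketch.
function normalizeName(string $text): string
{
    $lower = strtolower($text);
    return preg_replace('/[^a-z0-9 ]/', '', $lower);
}

// The value to store in the (assumed) name_search column:
$stored = normalizeName("Murray's");

// The value to bind into: SELECT * FROM users WHERE name_search = ?
$needle = normalizeName('murrays');

var_dump($stored === $needle); // bool(true)
```

Because the comparison is a plain equality on a stored column, that column can be indexed, which the REPLACE(...) variants cannot take advantage of.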

Maybe something like this: (untested!)
SELECT * FROM users WHERE REPLACE(user_name, '\'', '') = "murrays"

If this is for single-word searching, you could try using the Soundex or Metaphone functions. These would handle sounds-like matches as well as spelling variants.
MySQL has SOUNDEX() but no built-in Metaphone; PHP has both (storing their results would require separate columns).
Otherwise, Richy's no-punctuation extra column seems best.
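As a rough illustration of the separate-columns idea, here is a sketch using PHP's built-in `soundex()` and `metaphone()`. The helper name `phoneticKeys()` is made up for this example:

```php
<?php
// soundex() and metaphone() are built-in PHP string functions.
// Hypothetical helper returning the keys one might store in two extra columns.
function phoneticKeys(string $name): array
{
    return [
        'soundex'   => soundex($name),
        'metaphone' => metaphone($name),
    ];
}

$a = phoneticKeys('murrays');
$b = phoneticKeys("murray's");

// Compare the stored key with the key computed from the search term.
var_dump($a['soundex'] === $b['soundex']);
```

Phonetic keys are coarse, so unrelated names can share a key; treat a key match as a candidate to be checked, not as proof of a match.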

You could try adding a replace to your query like this
replace(name, '''','')
to temporarily get rid of the apostrophes for the match.
select name from nametable where replace(name,'''','') = 'murrays';

This query should be able to pick up "murrays" or "murray's".
var inputStr = "murrays";
inputStr = String.Replace("'", "\'", inputStr);
SELECT * FROM ATable WHERE Replace(AField, '\'', '') = inputStr OR AField = inputStr

Strip all non-letter characters from both the user input and the names in the database.
Use Levenshtein distance or Soundex to match murrays with murray or marrays. This is optional, but your users would love it.
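The two steps above could be sketched like this in PHP, assuming candidate names have already been fetched from the database. The helper `isCloseMatch()` and the default threshold of one edit are assumptions for illustration:

```php
<?php
// Hypothetical helper: true when two names are within $maxDistance edits
// of each other after non-letters are stripped and case is folded.
function isCloseMatch(string $input, string $candidate, int $maxDistance = 1): bool
{
    $clean = static function (string $s): string {
        return preg_replace('/[^a-z]/', '', strtolower($s));
    };
    // levenshtein() is a built-in PHP function returning the edit distance.
    return levenshtein($clean($input), $clean($candidate)) <= $maxDistance;
}

var_dump(isCloseMatch('murrays', "Murray's")); // identical after cleaning
var_dump(isCloseMatch('murrays', 'murray'));   // one deletion away
var_dump(isCloseMatch('murrays', 'marrays'));  // one substitution away
var_dump(isCloseMatch('murrays', 'smith'));    // too many edits apart
```

Comparing in PHP means every candidate row has to be pulled out of the database first, so this suits small tables or a pre-filtered result set.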

Related

mySQL: How do I combine a search value with a variable?

I'm sure that there is a stupidly simple solution to this, but unfortunately my google-fu is too weak to find it.
I have a number of different tables for sizing, all following the same naming convention i.e size_001, size_002 etc. Within a loop I need to get the size entry that matches with the results already found.
Unfortunately there are no totally unique identifiers, as they repeat in each table (roman numerals for sizing). But they are unique in each individual table. So what I've tried so far looks a little bit like this:
SELECT * FROM CONCAT('size_00', '.$sizeTableID[$j].') WHERE sizeName LIKE '$sizeNames[$j]'"
Where $sizeTableID is a number from 1-9 and sizeName is a string e.g. II or VI or, occasionally (because there's no consistency), 2 etc.
I've also tried ''$var'' inside the CONCAT and not using the CONCAT at all. Really I just need a way to join the database.size_00 and an integer variable.
If I understand correctly, this is actually simple:
$tablename = 'size_00'.$sizeTableID[$j];
$sql = "SELECT * FROM $tablename WHERE sizeName LIKE '{$sizeNames[$j]}'";
and I think that solves it.
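PHP is a bit quirky here: when the variable is an array element or object property, surround it with {} inside the double-quoted string.
Note that CONCAT() cannot supply a table name in the FROM clause, so build the name in the PHP string itself:
$sql = "SELECT * FROM size_00{$sizeTableID[$j]} WHERE sizeName LIKE '{$sizeNames[$j]}'";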
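Since a table name cannot be bound as a query parameter, one cautious option is to validate the number before building the name. A sketch, assuming the ids really are integers from 1 to 9 as the question states; the helper `sizeTableName()` is hypothetical:

```php
<?php
// Hypothetical helper: builds "size_00N" only for an integer N in 1..9,
// since identifiers cannot be bound as prepared-statement parameters.
function sizeTableName($id): string
{
    $n = filter_var($id, FILTER_VALIDATE_INT, [
        'options' => ['min_range' => 1, 'max_range' => 9],
    ]);
    if ($n === false) {
        throw new InvalidArgumentException('Unexpected size table id');
    }
    return sprintf('size_%03d', $n);
}

$table = sizeTableName('3');
// The size name is left as a placeholder to be bound by the database layer.
$sql = "SELECT * FROM `$table` WHERE sizeName = ?";
echo $sql, "\n";
```

Using `=` instead of LIKE here is a choice made for the sketch: the question's pattern contains no wildcards, so both behave the same for these values.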

Two partial MySQL matches in a concat_ws?

Here's my use case: I'm searching for a person by first and last name, but only type in a partial first and partial last name, how can I create a WHERE clause that catches all possible scenarios?
Example, I type "Joe Smith" and it has a result. I type "Joe" and it has Joe Smith and a few other Joe's. I type "Joe Sm" and it gives me Joe Smith.
I want to be able to type "J Smit" and get Joe Smith, is that possible? Do I need to break the search term on spaces in PHP before doing a LIKE?
Here's what I have so far that works with full matches:
WHERE CONCAT_WS(' ', owner.first_name, owner.last_name)
LIKE '%". $searchTerm ."%'
Any help would be greatly appreciated.
Why don't you do an explode(' ',$input) on your input in PHP and then compare all values of that array in your WHERE clause?
$inputArray = explode(' ',$input);
$whereArray = array();
foreach ($inputArray as $part)
{
    // append one condition per search part
    $whereArray[] = "CONCAT_WS(' ',owner.first_name,owner.last_name) LIKE '%$part%'";
}
$where = implode(' AND ',$whereArray);
And then use it like this:
$query = "SELECT * FROM owner WHERE $where";
Please pay attention to security, I didn't do that.
This still doesn't quite do what you want, because when you search for "J Smit" you want the system to be intelligent enough to search one part, say "J", in the first name column and the other part, "Smit", in the last name column. Clearly that's more complex, and the complexity increases with the number of parts to match. There is a solution for that, but you won't like it; it's ugly.
Has anybody got a not-so-ugly solution to this?
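On the security point raised in this answer, here is one way the explode-and-AND approach might look with placeholders instead of interpolated input. The helper name `buildNameWhere()` is an assumption, and the parameters are meant for a prepared statement (for example with PDO):

```php
<?php
// Hypothetical helper: returns [whereClause, params] for a search string,
// one LIKE condition per whitespace-separated part, combined with AND.
function buildNameWhere(string $input): array
{
    $parts = preg_split('/\s+/', trim($input), -1, PREG_SPLIT_NO_EMPTY);
    $conditions = [];
    $params = [];
    foreach ($parts as $part) {
        $conditions[] = "CONCAT_WS(' ', owner.first_name, owner.last_name) LIKE ?";
        // Escape LIKE wildcards typed by the user, then wrap in %...%.
        $params[] = '%' . addcslashes($part, '%_\\') . '%';
    }
    // With no parts, match everything rather than emit invalid SQL.
    $where = $conditions ? implode(' AND ', $conditions) : '1=1';
    return [$where, $params];
}

[$where, $params] = buildNameWhere('J Smit');
$query = "SELECT * FROM owner WHERE $where";
// $params would be passed to the statement's execute() call.
```

Like the original, this matches each part anywhere in the combined name, so "J Smit" also finds "Smitty Jones".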
It sounds like you do want to split the search term into a first and last name component, and then run LIKE comparisons against owner.first_name and owner.last_name separately. Unfortunately, I don't know of native MySQL support for straightforward string splitting.
Splitting in PHP first is certainly an option (the answer from @KIKOSoftware seems to do a good job of that). If you want to try to do it all in MySQL as an alternative, this SO question offers some insight (you will have to modify it for your use case, since you're delimiting on white space instead of commas):
How to split the name string in mysql?
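If the split is done in PHP, the separate first-name and last-name comparisons might be sketched as follows. The helper `buildFirstLastWhere()` is hypothetical, and treating the first word as the first name is an assumption that will not hold for every input:

```php
<?php
// Hypothetical helper: treats the first word as a first-name prefix and
// the remainder (if any) as a last-name prefix.
function buildFirstLastWhere(string $input): array
{
    $parts = preg_split('/\s+/', trim($input), 2, PREG_SPLIT_NO_EMPTY);
    // Escape LIKE wildcards, then append % for a prefix match.
    $prefix = static fn(string $s): string => addcslashes($s, '%_\\') . '%';

    if (count($parts) === 2) {
        return [
            '(owner.first_name LIKE ? AND owner.last_name LIKE ?)',
            [$prefix($parts[0]), $prefix($parts[1])],
        ];
    }
    if (count($parts) === 1) {
        // A single word could be either name, so check both columns.
        return [
            '(owner.first_name LIKE ? OR owner.last_name LIKE ?)',
            [$prefix($parts[0]), $prefix($parts[0])],
        ];
    }
    return ['1=1', []];
}

[$where, $params] = buildFirstLastWhere('J Smit');
```

With "J Smit" the parameters become `J%` and `Smit%`, so "Joe Smith" matches while "Smitty Jones" does not.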

LIKE Condition in PHP Not Work correctly

I have a column in my database named "active_sizes" and I want to filter my website items by size. For this, I use a LIKE condition in PHP:
AND active_sizes LIKE '%" . $_GET['size'] . "%'
But this code has a problem:
for example, when $_GET['size']=7.0 it also shows items whose active_sizes contains 17.0.
my active_sizes value looks like 17.0,5.0,6.5,7.5,,
thanks
Using comma-separated values in a single field in a database is indicative of bad design. You should normalize things, and have a separate "item_sizes" table. As it stands now, you need a VERY ugly where clause to handle such sub-string mismatches:
$s = number_format((float)$_GET['size'], 1, '.', ''); // e.g. "7.0"; assumes sizes are stored with one decimal place
... WHERE (active_sizes = '$s') // the only value in the field
OR (active_sizes LIKE '$s,%') // at the beginning of the field
OR (active_sizes LIKE '%,$s,%') // in the middle of the field
OR (active_sizes LIKE '%,$s') // at the end of the field
Or, if you normalized things properly and had these individual values in their own child table:
WHERE (active_sizes_child.size = $s)
I know which one I'd choose to go with...
You don't state which DB you're using, but if you're in MySQL, you can temporarily accomplish the same thing with
WHERE find_in_set('$s', active_sizes)
at the cost of losing portability. Relevant docs here: http://dev.mysql.com/doc/refman/5.0/en/string-functions.html#function_find-in-set
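A small sketch of the FIND_IN_SET route with the input validated first, plus a PHP-side illustration of why the substring test over-matches. The helper `buildSizeFilter()` and the one-decimal storage format are assumptions based on the sample data in the question:

```php
<?php
// Hypothetical helper: accepts only sizes shaped like "7" or "7.5" and
// returns [sql fragment, params] for a prepared statement.
function buildSizeFilter(string $rawSize): array
{
    if (!preg_match('/^\d{1,2}(\.\d)?\z/', $rawSize)) {
        throw new InvalidArgumentException('Unexpected size value');
    }
    // Assumption: stored sizes always carry one decimal place ("7.0").
    $size = number_format((float) $rawSize, 1, '.', '');
    return ['FIND_IN_SET(?, active_sizes) > 0', [$size]];
}

[$fragment, $params] = buildSizeFilter('7.0');

// Why LIKE '%7.0%' over-matches: a substring test versus a whole-item test.
$stored = '17.0,5.0,6.5,7.5';
var_dump(strpos($stored, '7.0') !== false);             // substring hit inside "17.0"
var_dump(in_array('7.0', explode(',', $stored), true)); // no whole item equals "7.0"
```

FIND_IN_SET compares whole comma-separated items, which is the same distinction the `explode()` and `in_array()` lines show on the PHP side.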
You have % signs around your $_GET value. Combined with LIKE, this means that any string that simply contains your GET value will be returned. If you want an exact match, use the = operator instead, without the percent signs.
This will solve your immediate issue when the requested size is the first value in the field:
AND active_sizes LIKE '" . mysql_real_escape_string($_GET['size']) . "%'
If you are using a database other than MySQL, use the corresponding escape function. Never trust input data.
Besides, I'd suggest using a numeric type (DECIMAL or NUMERIC) for the active_sizes field. This will speed up your queries, use less memory, allow queries like active_sizes BETWEEN 16.5 AND 17.5, and is generally a more correct data type for a shoe size.

having trouble search through mysql database

I have two questions regarding my script and searching. I have this script:
$searchTerms = explode(' ', $varSearch);
$searchTermBits = array();
foreach($searchTerms as $term){
$term = trim($term);
if(!empty($term)){
$searchTermBits[] = "column1 LIKE '%".$term."%'";
}
}
$sql = mysql_query("SELECT * FROM table WHERE ".implode(' OR ', $searchTermBits)."");
I have a column1 with a data name "rock cheer climbing here"
If I type in "rock climb", this data shows. That's perfect, but if I just type "Rocks", it doesn't show. Why is that?
Also, How would I add another "column2" for the keyword to search into?
Thank you!
Searching that string for "rocks" doesn't work, because the string "rocks" doesn't exist in the data. Looking at it, it makes sense to you, because you know that the plural of "rock" is "rocks", but the database doesn't know that.
One option you could try is removing the S from search terms, but you run into other issues with that - for example, the plural of "berry" is "berries", and if you remove the S, you'll be searching for "berrie" which doesn't get you any further.
You can add more search terms by adding more lines like
$searchTermBits[] = "column1 LIKE '%".$term."%'";
and replacing ".$term." with what you want to search for. For example,
$searchTermBits[] = "column1 LIKE '%climb%'";
One other thing to note: as written, your code is susceptible to SQL injection. For example, what if the site visitor types in a search term such as '; DROP TABLE tablename; ? Depending on how the query is sent, an attacker could read or destroy your data.
What you should do is modify your searchTermBits[] line to look like:
$searchTermBits[] = "column1 LIKE '%" . mysql_real_escape_string($term) . "%'";
That will prevent any nastiness from harming your data.
Assuming the data you gave is accurate, it shouldn't match because you're searching for "Rocks" and the word in the string is "rock". By default MySQL doesn't do case-sensitive matching, so letter case is probably not the cause.
Also, to avoid sql injection, you absolutely should be using mysql_real_escape_string to escape your content.
Adding a second column would be pretty easy as well. Just add two entries to your array for every search term, one for column1 and one for column2.
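The "two entries per search term" idea could be sketched as below. It uses placeholders for a prepared statement because the mysql_* functions, including mysql_real_escape_string, were removed in PHP 7.0. The helper `buildSearchWhere()` is hypothetical:

```php
<?php
// Hypothetical helper: every term may match any of the given columns;
// the conditions are combined with OR, as in the question's script.
function buildSearchWhere(string $search, array $columns): array
{
    $terms = preg_split('/\s+/', trim($search), -1, PREG_SPLIT_NO_EMPTY);
    $bits = [];
    $params = [];
    foreach ($terms as $term) {
        foreach ($columns as $column) {
            // Column names come from code, never from user input.
            $bits[] = "$column LIKE ?";
            $params[] = '%' . $term . '%';
        }
    }
    return [implode(' OR ', $bits), $params];
}

[$where, $params] = buildSearchWhere('rock climb', ['column1', 'column2']);
```

An empty search string yields an empty clause, so the caller should check for that before appending it to a query.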
Your column1 data is "rock cheer climbing here" and your search pattern is %Rocks%; it doesn't match because "rocks" does not occur in the column1 data.
You can add column2 the same way as column1 and then combine the two groups with an AND operator: (column1 LIKE "%rock%" OR column1 LIKE "%climb%") AND (column2 LIKE "%rope%" OR column2 LIKE "%powder%")
TIPS:
If your table/schema uses a collation ending in _ci (xx_xx_ci), comparisons are case insensitive, so MySQL ignores case. With a case-sensitive collation, the case of the search term must match the data.

MySQL LIKE question

I have a script:
$friendnotes = mysql_query("SELECT nid,user,subject,message FROM friendnote
WHERE tousers LIKE '%$userinfo[username]%' ");
And the content in the "tousers" table of the database:
Test
Example
User
That script appears to be working well
However, if there is a user called "Test2", it would also display content that has "Test2" in the database where $userinfo[username] is just "Test"
Is there any way to fix that problem? For example (this is just an example, I don't mind if you give another way) make it so that it searches whole lines?
EDIT: I don't think anyone understands: the "tousers" table contains multiple values (separated by line), not just one. I want it to search each LINE (or anything that works similarly), not each row.
The condition
tousers LIKE '%Test%'
means that tousers contains "Test" at some point, so it is true for "Test", "MyTest", "Test3", "MyTest3", and so on.
If you want only to match the current user, try
... WHERE tousers = '$userinfo[username]'
EDIT If you really want to store multiple names in one column (separated by newlines), you could use a REGEXP pattern like
WHERE tousers REGEXP '(^|\\n)($userinfo[username])($|\\n)'
Make sure that $userinfo[username] does not contain any regular-expression metacharacters ('$', '^', '|', '(', etc.). Also (as mentioned in the comments above) this solution is suboptimal in terms of security, performance, etc.: it would be better to model a 1:n relationship between the friendnote table and some friendnotes_user table ...
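To see what the whole-line pattern does and does not match, the same shape can be tried in PHP with `preg_match()`. This only illustrates the matching logic: `preg_quote()` targets PCRE, and it is an assumption, not a guarantee, that its escaping is also appropriate for MySQL's REGEXP. The helper `isRecipient()` is made up for the example:

```php
<?php
// Hypothetical helper: true when $username is exactly one of the
// newline-separated entries in $tousers.
function isRecipient(string $tousers, string $username): bool
{
    // preg_quote() escapes regex metacharacters in the name,
    // including the "/" delimiter passed as its second argument.
    $pattern = '/(^|\n)' . preg_quote($username, '/') . '($|\n)/';
    return preg_match($pattern, $tousers) === 1;
}

$tousers = "Test2\nExample\nUser";
var_dump(isRecipient($tousers, 'Test')); // "Test2" is not a whole-line match
var_dump(isRecipient($tousers, 'User')); // last line, no trailing newline
```

If the stored value uses Windows line endings (`\r\n`), the pattern would need to allow for the carriage return as well.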
Ok, so it sounds like the tousers field can contain values like 'stuff test option whatever' and 'foo test2 something blah blah', and you want to match the first but not the second. In that case, you need to include the delimiters around your search term. Assuming the search term will always have a space before and either a space or comma after it, you could do something like:
... WHERE tousers LIKE '%[ ]$userinfo[username][ ,]%'
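This will encounter problems, however, if your search term can occur at the beginning of the field (no space character before it) or at the end of the field (no delimiter after it). In that case, you might need to have multiple LIKE clauses. Also note that the bracketed character classes above are SQL Server LIKE syntax; MySQL's LIKE only understands % and _, so in MySQL this idea needs REGEXP or several LIKE clauses.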
This will work if you remove the % signs, which are what allow for pattern matching.
$friendnotes = mysql_query("SELECT nid,user,subject,message FROM friendnote
WHERE tousers LIKE '$userinfo[username]' ");
But the consensus seems to be that using equals will be faster. See https://stackoverflow.com/questions/543580/equals-vs-like.
So in that case, change to
$friendnotes = mysql_query("SELECT nid,user,subject,message FROM friendnote
WHERE tousers = '$userinfo[username]' ");
Edit - regarding your edit, that is not a really good design. If a user can have multiple "tousers" (ie a one-to-many relationship), that should be represented as a separate table tousers, where each row represents one "touser" and has a foreign key on the user id to match it with the friendnote table. But if you absolutely can't change your design, you might want to match like this:
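WHERE tousers LIKE '%$userinfo[username]\n%'
ensuring that there is a line break immediately following the username. Note that this still matches a longer name that ends with the username (such as "MyTest"), and misses the last line if it has no trailing line break.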
From what I understand, you should just use strict comparison:
where tousers = 'whatever'
That is because tousers like %whatever% matches any row, in which the tousers field has 'whatever' anywhere in its content, so it matches 'whatever', '123whatever', 'whatever321' and '123whatever321'. I hope you get the idea.
So you only want to search for exact name matches? If so, just use an = and remove the % wildcards:
$friendnotes = mysql_query("SELECT nid,user,subject,message FROM friendnote
WHERE tousers = '$userinfo[username]' ");
This is a perfect use case for the MySQL REGEXP operator.
