Perform accent insensitive fulltext search MySQL

I'm currently developing search functionality for a website. Users search for other users by name. I'm having some trouble getting good results for users who have accents in their names.
I have a FULLTEXT index on the name column and the table's collation is utf8_general_ci.
Currently if somebody registers for the site, and has a name with accents (for example: Alberto Andrés), the name is stored in the DB as shown in the following image:
So if I run the query SELECT * MATCH(name) AGAINST('alberto andres'), I get lots of results with better match scores, like 'Alberto', 'Andres', and 'Andrés', and finally, with a low match score, the record the user is probably looking for: 'Alberto Andrés'.
What could I do to take into account the way accented records are currently stored in the DB?
Thanks!

It looks to me like the surname of el Señor Andrés is actually stored correctly. The rendering you showed us is the way some non-UTF apps mangle UTF8 text.
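A minimal Python sketch of that kind of mangling, assuming the name is stored as UTF-8 and then displayed by a tool that reads the bytes as Latin-1 (render_as_latin1 is a made-up helper name, and the exact rendering in the question may differ):

```python
def render_as_latin1(text: str) -> str:
    """Show how UTF-8 bytes look when a viewer wrongly decodes them as Latin-1."""
    return text.encode("utf-8").decode("latin-1")


stored = "Alberto Andrés"
mangled = render_as_latin1(stored)
print(mangled)  # Alberto AndrÃ©s

# Undoing the wrong decoding recovers the original text,
# which suggests the stored bytes themselves were fine.
print(mangled.encode("latin-1").decode("utf-8") == stored)  # True
```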
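You might try this modification of your query if you don't yet have many records in your table. Natural-language (non-boolean) fulltext mode behaves oddly on small data sets; on MyISAM tables, for instance, a word that appears in half or more of the rows is treated as a stopword and matches nothing.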
SELECT *
FROM your_table -- placeholder: use your actual table name
WHERE MATCH(name) AGAINST('alberto andres' IN BOOLEAN MODE)
You also might try
SELECT *
FROM your_table -- placeholder: use your actual table name
WHERE MATCH(name) AGAINST(CONVERT('alberto andres' USING utf8))
just to make sure your match string is in the same character set as your MySQL columns.
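For comparison, here is a rough Python sketch of the kind of folding an accent- and case-insensitive comparison performs, so you can check on the client side that 'alberto andres' and 'Alberto Andrés' reduce to the same key. It is only an approximation based on Unicode decomposition, not MySQL's actual collation rules, and fold_accents is a made-up helper name.

```python
import unicodedata


def fold_accents(text: str) -> str:
    """Return a lowercase, accent-free key for rough comparisons."""
    # NFD splits 'é' into 'e' plus a combining acute accent.
    decomposed = unicodedata.normalize("NFD", text)
    # Drop the combining marks, keeping only the base characters.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


print(fold_accents("Alberto Andrés"))  # alberto andres
print(fold_accents("Alberto Andrés") == fold_accents("alberto andres"))  # True
```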

Related

MySQL 5.7.9 fulltext research in multiple tables with Sphinx on WAMP

I have to implement a search across multiple MySQL tables for my internship.
It is for a web phone directory: a text input in a form takes the search terms.
I tried to use the MATCH/AGAINST syntax, but it does not seem to do what I need.
My current query is:
SELECT U_ID, 'users'
from users
where match ([columns that I want to search in]) AGAINST ([The text inside my search field])
UNION
SELECT S_ID, 'service'
from service
where match ([columns that I want to search in]) AGAINST ([The text inside my search field])
The problem is the following: with this type of search, I have to pass the search text to several separate MATCH clauses, so I can't get a relevant result (each AGAINST is evaluated independently). The perfect solution would be to replace the UNION with an 'INTER', but that would be too easy.
I don't know if it is useful, but I use PDO to send my query from PHP.
I searched for solutions but couldn't find one that fits my case:
https://dev.mysql.com/doc/refman/5.5/en/fulltext-search.html
Using Full-Text Search in SQL Server 2008 across multiple tables, columns
MySQL fulltext search on multiple tables with different fields
Then I tried to use Sphinx, but the documentation is too complicated for me and I couldn't understand it (http://sphinxsearch.com/docs/current.html).
Can someone help me find the query I need, or point me to a clear and simple Sphinx tutorial for Windows (I have already read the IBM one)?
Edit:
I wanted to illustrate my problem with set theory ('inter' means intersection).
For example, when I type "John accounting department" in my form input, I want to display all users named John, but only if they belong to the accounting department.
With my current search, I get the IDs of all the departments named "John" or "accounting" or "department", and all the IDs of the people named "John" or "accounting" or "department".
That's my actual problem.
I think a materialized view would be the easiest way to solve this in MySQL.
It is not strictly a feature of MySQL (MySQL can't maintain the view automatically, so it's not really a view); it's a normal table that just happens to hold a copy of data from other tables.
It can be created quite easily:
CREATE TABLE my_search_view (fulltext(U_Name,S_Name))
SELECT U_ID, U_Name, S_ID, S_Name
FROM users
INNER JOIN service USING (S_ID);
(If you update the source tables, just drop the table and recreate it. You can use triggers if you want a more dynamic solution, but depending on the size of the data it may not be worth the effort.)
Now you can run a simple query against this table (it has a full-text index):
SELECT *
FROM my_search_view
WHERE MATCH (U_Name,S_Name) AGAINST ('John accounting department')
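Note that a natural-language AGAINST still matches rows containing any of the words. If every word must be present (the intersection the question asks for), one option is boolean mode, where a leading + marks a term as required. Below is a small Python sketch of building such a search string. to_required_terms is a hypothetical helper, and the result is meant to be bound as the parameter of AGAINST(... IN BOOLEAN MODE) rather than concatenated into the SQL. Stopwords and the minimum word length still apply.

```python
import re


def to_required_terms(user_input: str) -> str:
    """Turn free text into a boolean-mode string where every word is required."""
    # \w+ keeps letters, digits and underscores, which also drops
    # boolean-mode operators such as + - * " ( ) typed by the user.
    words = re.findall(r"\w+", user_input)
    return " ".join("+" + word for word in words)


print(to_required_terms("John accounting department"))
# +John +accounting +department
```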

Mysql full text search for arabic

I'm trying to do a full text search on utf8_unicode_ci encoded tables using MySQL with MyISAM engine.
Here is the query:
$sql = "select distinct u.userId,u.rite,u.link,u.reg,u.prenom,u.nom,u.pere,u.birthday,u.sex,
u.mazhab,u.mere,u.martialStat,u.voted,MATCH(`nom`,`prenom`) AGAINST (':str' IN BOOLEAN MODE)
AS relevance FROM users as u,users_responsibles as ur,users_delegates as ud where MATCH(`nom`,`prenom`)
AGAINST (':str' IN BOOLEAN MODE) and u.userId=ur.userId and u.userId=ud.userId";
It searches for a user by name, where the user must also be linked to a responsible and a delegate, hence the joins on users_responsibles and users_delegates.
This works fine for only a handful of names; for the others it returns either 0 rows or an unexpected result, such as returning 'mike' when searching for 'alice' (in Arabic).
I've been searching for a while now, and all the answers date from 2012 at the latest. I tried changing ft_min_word_len to 3 as suggested in some comments, with no luck.
I read many answers saying that MySQL fulltext doesn't index characters with >1000 byte encoding.
Is there any solution for this while still using MySQL? If not, is there any other way to do this?

Mysql search results for similar sounds

Suppose I have a user table. One of its columns stores the user's first name.
Also suppose there are three rows in the table. The first names are as follows:
'Suman','Sumon','Papiya'.
Now I want a MySQL query such that if someone searches the table by first name with 'Suman', the result shows two rows: one for 'Suman' and another for 'Sumon'.
You can use SOUNDEX(), which checks whether the sound of the values in firstname matches the sound of the provided word.
According to the docs:
When using SOUNDEX(), you should be aware of the following limitations:
This function, as currently implemented, is intended to work well with strings that are in the English language only. Strings in other languages may not produce reliable results.
This function is not guaranteed to provide consistent results with strings that use multi-byte character sets, including utf-8.
select *
from t
where soundex(firstname)=soundex('Suman')
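To see why 'Suman' and 'Sumon' match while 'Papiya' does not, here is a minimal Python sketch of the classic four-character American Soundex. This is an illustration only: MySQL's SOUNDEX() uses a slightly different variant and can return longer strings, so the exact codes may differ from what the server produces, although both names still map to one shared code.

```python
# Consonant groups of classic Soundex; vowels, H, W and Y have no code.
CODES = {}
for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                       ("L", "4"), ("MN", "5"), ("R", "6")):
    for letter in letters:
        CODES[letter] = digit


def soundex(name: str) -> str:
    """Return the classic 4-character Soundex code for an ASCII name."""
    letters = [c for c in name.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    result = letters[0]
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate letters with equal codes
        code = CODES.get(ch, "")
        if code and code != prev:
            result += code
        prev = code  # a vowel resets prev, so a repeated code counts again
    return (result + "000")[:4]


print(soundex("Suman"), soundex("Sumon"), soundex("Papiya"))
# S550 S550 P100
```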

How to search singular/plurals in php mysql

When I enter a word (e.g. freshers), I want both fresher and freshers records, but currently I get only freshers records. My code is like this:
$search='freshers';
$qry=mysql_query("select count(*) from jobs where job_title like '%$search%' or MATCH(job_title)
AGAINST('$search' IN BOOLEAN MODE)");
When the search word is freshers, I get a count of 1200. When the search word is fresher, I get a count of 2000.
How can I get almost the same count whether I enter freshers or fresher?
You cannot get precisely the same matching-record count from any MySQL technology with a search term that's either singular or plural.
MySQL doesn't have the smarts to know that freshers is the plural of fresher or children is the plural of child. If you want to do this with MySQL you'll have to start your search with the singular form of the word for which you want the plural.
Neither does MySQL know that mice is the plural of mouse.
If you need automated plural/singular functionality you may want to investigate Lucene or some other natural language search tech. The name of the capability you seek is "stemming."
But you can use FULLTEXT search with terms that have trailing asterisks. For example, 'fresher*' matches fresher, freshers, and even fresherola. This will extend a search from singular to plural. It will not work the other way around.
select count(*)
from jobs
where MATCH(job_title) AGAINST('fresher*' IN BOOLEAN MODE)
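If the user may type either form, the application can reduce the word to a rough singular before appending the asterisk. The Python sketch below uses a deliberately crude heuristic (strip a single trailing "s"), not real stemming: it does nothing for irregular plurals such as children or mice. to_prefix_term is a hypothetical helper name.

```python
def to_prefix_term(word: str) -> str:
    """Build a boolean-mode prefix term such as 'fresher*' from user input."""
    base = word.strip().lower()
    # Crude singularisation: drop one trailing 's', but leave short words
    # and words ending in 'ss' (e.g. 'glass') untouched.
    if len(base) > 3 and base.endswith("s") and not base.endswith("ss"):
        base = base[:-1]
    return base + "*"


print(to_prefix_term("freshers"))  # fresher*
print(to_prefix_term("fresher"))   # fresher*
```

Both inputs then produce the same AGAINST string, so the two searches return the same count.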
There are some other modifying characters for boolean mode search terms. They are mentioned here:
http://dev.mysql.com/doc/refman/5.5/en/fulltext-boolean.html
Pro tip: column LIKE '%searchterm%' is probably the slowest way MySQL offers to search a column. It is guaranteed to scan the whole table.
Pro tip: FULLTEXT search is inherently a bit fuzzy. Expecting crisp record counts from it is probably a path to confusion.
Try this:
$search='fresher';
$qry=mysql_query("select count(*) from jobs where job_title like '$search%'");

MySQL Match Against Reserved Word in Field

In a database I work with, there are a few million rows of customers. To search this database, we use a match against Boolean expression. All was well and good, until we expanded into an Asian market, and customers are popping up with the name 'In'. Our search algorithm can't find this customer by name, and I'm assuming that it's because it's an InnoDB reserved word. I don't want to convert my query to a LIKE statement because that would reduce performance by a factor of five. Is there a way to find that name in a full text search?
The query in production is very long, but the portion that's not functioning as needed is:
SELECT
`customer`.`name`
FROM
`customer`
WHERE
MATCH(`customer`.`name`) AGAINST("+IN*+KYU*+YANG*" IN BOOLEAN MODE);
Oh, and the innodb_ft_min_token_size variable is set to 1 because our customers "need" to be able to search by middle initial.
It isn't a reserved word, but it is in the stopword list. You can override this with ft_stopword_file to supply your own list of stopwords (that variable applies to MyISAM; for InnoDB the corresponding settings are innodb_ft_enable_stopword, innodb_ft_server_stopword_table and innodb_ft_user_stopword_table). Two possible problems with this: (1) after changing it, you need to rebuild your fulltext index; (2) it's a global variable, so you can't alter it on a per-session, per-location or per-language basis. If you really need all the words and use many different languages in one database, providing an empty list is almost the only way to go, which can hurt a bit in cases where you would like a stopword list to be used.
