Partial keyword searching in MySQL - php

Context:
I am trying to create a search function for my website where a user can type in full sentences and receive results back based on the matching of keywords in the sentence with words stored in a MySQL database:
**ID | Skill**
1 | Painting
2 | Carpenter
3 | Builder
For example a user may search "I want some painting to be done" and using the following MySQL query (along with a foreach and explode function) it will return ID 1 from the database:
$stmt = $mysqli->prepare ("SELECT username FROM users WHERE users.id IN (SELECT
skills.userid FROM skills WHERE skills.skill LIKE CONCAT('%',?,'%') GROUP BY
skills.skill ORDER BY CASE WHEN skills.skill LIKE CONCAT(?,'%') THEN 0 WHEN
skills.skill LIKE CONCAT('% %',?,'% %') THEN 1 WHEN skills.skill LIKE CONCAT('%',?)
THEN 2 ELSE 3 END, skills.skill)");
Exam question:
The issue I have is that if a user was to type "I want a painter" then ID 1 would not be returned. How can the query be modified to account for the fact that painting and painter are similar and so should be returned?

You can add a column called synonymous to the skills table, holding some keywords for that skill.
For example, the "Painting" row would have "paint painting painter" in the synonymous column.
Then change your query to check the synonymous column instead of the skill column.
This is the simplest way, but it requires you to fill in synonyms for every row of the skills table.
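A minimal sketch of the synonym-column idea, using Python's sqlite3 module as a stand-in for PHP and MySQL so that it runs on its own. The synonym lists for Carpenter and Builder and the helper name find_skill_ids are made up for illustration. In MySQL the string concatenation would be written with CONCAT() rather than ||.

```python
import re
import sqlite3

# Assumed schema: the skills table from the question plus the suggested
# "synonymous" column. The synonym lists are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE skills (id INTEGER PRIMARY KEY, skill TEXT, synonymous TEXT)"
)
conn.executemany(
    "INSERT INTO skills (id, skill, synonymous) VALUES (?, ?, ?)",
    [
        (1, "Painting", "paint painting painter"),
        (2, "Carpenter", "carpenter carpentry woodwork"),
        (3, "Builder", "builder building build"),
    ],
)


def find_skill_ids(sentence):
    """Return ids of skills whose synonym list contains a word of the sentence."""
    ids = set()
    # Keep only letters and digits so user input cannot add LIKE wildcards.
    for word in re.findall(r"[a-z0-9]+", sentence.lower()):
        # Pad both sides with spaces so only whole synonyms match.
        rows = conn.execute(
            "SELECT id FROM skills WHERE ' ' || synonymous || ' ' LIKE ?",
            (f"% {word} %",),
        )
        ids.update(row[0] for row in rows)
    return sorted(ids)


print(find_skill_ids("I want a painter"))
```

Matching whole synonyms rather than substrings keeps short words such as "a" or "I" from matching every row.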

Related

Creating a related articles query in PHP

I'm trying to make something similar to "related articles". This is what I have.
$query = "SELECT * FROM posts WHERE post_tags LIKE '%$post_tags%'";
I want it to 'select all from post where the post tags are similar to the current post's tags'.
Note: The tags look like this in the database: "tech, news, technology, iphone"
I looked into things like
$tags = explode(",", $post_tags );
But I'm not sure.
Use FullText search:
SELECT * FROM `posts` WHERE MATCH(post_tags) AGAINST('$post_tags' IN BOOLEAN MODE)
The query requires a FULLTEXT index on the post_tags column of your posts table (older MyISAM tables can run boolean-mode searches without one, but slowly). This query will be a lot faster than your current attempt.
Query to add the index:
ALTER TABLE `posts` ADD FULLTEXT INDEX `tag_search` (`post_tags`)
A better, faster approach
Change how you store the post-to-tag relationship in the DB. Your posts table should not be used to store tags because one post has many tags, but each post has only one record in the posts table. Instead, have two other tables:
tags table
tag_id | name
1 | technology
2 | news
3 | hobbies
post_tags table
post_id | tag_id
1 | 1
1 | 3
2 | 1
Notice it's easy to tell that post_id #1 has the technology and hobbies tags. This will make your queries easier, and faster.
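As a rough illustration of how the two-table layout answers the "related articles" question, here is a sketch using Python's sqlite3 module in place of PHP and MySQL. The table contents are the ones shown above; the function name related_posts and the ordering by number of shared tags are assumptions, not part of the original answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE tags (tag_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE post_tags (
        post_id INTEGER,
        tag_id INTEGER,
        PRIMARY KEY (post_id, tag_id)
    );
    INSERT INTO tags VALUES (1, 'technology'), (2, 'news'), (3, 'hobbies');
    INSERT INTO post_tags VALUES (1, 1), (1, 3), (2, 1);
    """
)


def related_posts(post_id):
    """Other posts sharing at least one tag, most shared tags first."""
    rows = conn.execute(
        """
        SELECT other.post_id, COUNT(*) AS shared
        FROM post_tags AS mine
        JOIN post_tags AS other
          ON other.tag_id = mine.tag_id
         AND other.post_id <> mine.post_id
        WHERE mine.post_id = ?
        GROUP BY other.post_id
        ORDER BY shared DESC, other.post_id
        """,
        (post_id,),
    )
    return rows.fetchall()


print(related_posts(1))
```

The self-join on post_tags compares exact tag ids, so no LIKE pattern or string splitting is needed.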
Even faster!
If you do want to store everything in the posts table but have even faster performance, you will need to store your tags as bit flags. For instance, if the following is true in your PHP application:
$techBit = 0b001; // number 1 in binary form
$newsBit = 0b010; // number 2 in binary form
$hobbiesBit = 0b100; // number 4 in binary form
Then it's easy to store tags in one field. A post that has technology and hobbies tag would have a value:
$tag = $techBit | $hobbiesBit; // 1 + 4 = 5
And if you wanted to search for all records with technology or hobbies, you would do:
// means: records where post_tags has either techBit or hobbiesBit turned ON
SELECT * FROM `posts` WHERE (`post_tags` & ($techBit | $hobbiesBit)) > 0
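The bit-flag lookup can be tried out with a small sketch. It uses Python's sqlite3 module instead of PHP and MySQL, and the three sample posts as well as the helper names posts_with_any and posts_with_all are invented for the example.

```python
import sqlite3

# Same flag values as in the PHP snippet above.
TECH_BIT = 0b001
NEWS_BIT = 0b010
HOBBIES_BIT = 0b100

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, post_tags INTEGER)")
conn.executemany(
    "INSERT INTO posts VALUES (?, ?)",
    [
        (1, TECH_BIT | HOBBIES_BIT),  # stored as 5
        (2, NEWS_BIT),                # stored as 2
        (3, TECH_BIT | NEWS_BIT),     # stored as 3
    ],
)


def posts_with_any(mask):
    """Posts that have at least one of the flags in mask turned on."""
    rows = conn.execute(
        "SELECT id FROM posts WHERE (post_tags & ?) > 0 ORDER BY id", (mask,)
    )
    return [row[0] for row in rows]


def posts_with_all(mask):
    """Posts that have every flag in mask turned on."""
    rows = conn.execute(
        "SELECT id FROM posts WHERE (post_tags & ?) = ? ORDER BY id", (mask, mask)
    )
    return [row[0] for row in rows]


print(posts_with_any(TECH_BIT | HOBBIES_BIT))
print(posts_with_all(TECH_BIT | NEWS_BIT))
```

Comparing the masked value with the mask itself, instead of with 0, turns the "either tag" search into an "all of these tags" search.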
Well instead of "LIKE" you could use the "IN" clause.
$Results = join("','",$post_tags);
$SQLQuery = "SELECT * FROM galleries WHERE id IN ('$Results')";
Example: Passing an array to a query using a WHERE clause

SQL finding specific character in table

I have a table like this
d_id | d_name | d_desc | sid
1 |flu | .... |4,13,19
Where sid is VARCHAR. What I want is that entering 4 or 13 or 19 displays flu. However, my query only works when the user selects all of those values. Here is my query:
SELECT * FROM diseases where sid LIKE '%sid1++%'
For the above query, I work with PHP and use a for loop to put the sid values inside the LIKE value; I just wrote sid++ here to keep it simple. My query only works when all of the values are present. If, say, the user selects 4 and 19, which gives '%4,19%', then it displays nothing. Thanks all.
If you must do what you ask for, you can try to use FIND_IN_SET().
SELECT d_id, d_name, d_desc
FROM diseases
WHERE FIND_IN_SET(13,sid)<>0
But this query will not be sargable, so it will be outrageously slow if your table contains more than a few dozen rows. And the ICD10 list of disease codes contains almost 92,000 rows. You don't want your patient to die or get well before you finish looking up her disease. :-)
So, you should create a separate table. Let's call it diseases_sid.
It will contain two columns. For your example the contents will be
d_id | sid
1 | 4
1 | 13
1 | 19
If you want to find a row from your diseases table by sid, do this.
SELECT d.d_id, d.d_name, d.d_desc
FROM diseases d
JOIN diseases_sid ds ON d.d_id = ds.d_id
WHERE ds.sid = 13
That's what my colleagues are talking about in the comments when they mention normalization.
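To make the normalized layout concrete, here is a small sketch with Python's sqlite3 module standing in for PHP and MySQL. The second disease row ("cold"), the index name and the helper name diseases_for are assumptions added for illustration. It covers the case from the question where the user selects several values, such as 4 and 19.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE diseases (d_id INTEGER PRIMARY KEY, d_name TEXT, d_desc TEXT);
    CREATE TABLE diseases_sid (
        d_id INTEGER,
        sid INTEGER,
        PRIMARY KEY (d_id, sid)
    );
    -- An index on sid lets the lookup avoid scanning the whole table.
    CREATE INDEX idx_diseases_sid_sid ON diseases_sid (sid);
    INSERT INTO diseases VALUES (1, 'flu', '....'), (2, 'cold', '....');
    INSERT INTO diseases_sid VALUES (1, 4), (1, 13), (1, 19), (2, 4), (2, 7);
    """
)


def diseases_for(sids):
    """Diseases linked to at least one of the selected sid values."""
    if not sids:
        return []
    # One "?" placeholder per selected value keeps the query parameterized.
    placeholders = ",".join("?" for _ in sids)
    rows = conn.execute(
        f"""
        SELECT DISTINCT d.d_id, d.d_name
        FROM diseases d
        JOIN diseases_sid ds ON d.d_id = ds.d_id
        WHERE ds.sid IN ({placeholders})
        ORDER BY d.d_id
        """,
        list(sids),
    )
    return rows.fetchall()


print(diseases_for([4, 19]))
```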

Best way to do search word structure?

I want to build up a database where there will be about 400 strings. I want to make the database searchable.
The structure of the database will be like:
Brand | model |additional products | price | search words | (this is 1 string, there will about 400 strings)
There will be between 2 and 50 search words on each string. Searches are done by ticking checkboxes, and the words of the ticked checkboxes are searched for in the database.
My question is: what is the best way to index all the search words?
I’m thinking of 2 ways:
1. In the search words field, all searchable words are listed together, like: 1GB RAM, 512GB RAM, ATA, SATA… and so on for each string. This means that ALL words for a given string are in the same field, separated by “,”.
2. Each search word has its own column, like: | search words 1| search words 2| search words 3 | search words 4 | search words 5|….. and so on. In |search words 1| would be 1GB RAM, in | search words 2| would be 512GB RAM, and so on… This means that for a given string maybe half of the search word columns will be filled.
In option 2 there will be more than 50 columns in the database, with each search word in a different column (1 in each column for each product). In option 1 there will be 1 column holding all the words for each product.
Or is there a better way to do this?
Even though another answer was accepted... I explained this idea a little more because I feel that it meets "best practices" and allows you to associate more than one word with one item while not repeating data.
You should end up with three tables:
item: item_id | Brand | model |additional products | price
word: word_id | word
item_word: item_word_id | item_id | word_id
the data would look like:
Item:
item_id | brand | model | additional_products | price
1 | nokia | g5 | | 100
2 | toshiba | satellite | | 1000
word:
word_id | word
1 | 1 GB
2 | ATA
3 | SATA
4 | 512GB RAM
item_word:
item_word_id | item_id | word_id
1 | 1 | 1
2 | 1 | 2
3 | 2 | 3
4 | 2 | 4
so that nokia had these words: 1 GB, ATA and toshiba had these words: SATA, 512GB RAM. (I realize this doesn't make much sense, it's just an example)
then query it like..
select item.*, word
from item
join item_word on item.item_id = item_word.item_id
join word on item_word.word_id = word.word_id
and filter it like...
select item.*, word
from item
join item_word on item.item_id = item_word.item_id
join word on item_word.word_id = word.word_id
where word in ('1GB RAM', '512GB RAM', 'ATA')
to see what is the most relevant result you could even try...
select item.item_id, item.brand, item.model, count(*) as word_count
from item
join item_word on item.item_id = item_word.item_id
join word on item_word.word_id = word.word_id
where word in ('1GB RAM', '512GB RAM', 'ATA')
group by item.item_id, item.brand, item.model
order by count(*) desc
for something that matches all the words provided, you would use...
select item.item_id, item.brand, item.model, count(*) as word_count
from item
join item_word on item.item_id = item_word.item_id
join word on item_word.word_id = word.word_id
where word in ('1GB RAM', 'ATA')
group by item.item_id, item.brand, item.model
having count(*)=2
where the number in the having clause is the number of words in your in statement... word in ('1GB RAM', 'ATA'), so in this case it is 2.
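The ranking and match-all queries above can be exercised with a short sketch. It uses Python's sqlite3 module rather than MySQL, the word values are spelled the way the queries spell them ('1GB RAM' instead of '1 GB'), and the helper names rank_items and items_with_all are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE item (item_id INTEGER PRIMARY KEY, brand TEXT, model TEXT, price INTEGER);
    CREATE TABLE word (word_id INTEGER PRIMARY KEY, word TEXT UNIQUE);
    CREATE TABLE item_word (
        item_word_id INTEGER PRIMARY KEY,
        item_id INTEGER,
        word_id INTEGER
    );
    INSERT INTO item VALUES (1, 'nokia', 'g5', 100), (2, 'toshiba', 'satellite', 1000);
    INSERT INTO word VALUES (1, '1GB RAM'), (2, 'ATA'), (3, 'SATA'), (4, '512GB RAM');
    INSERT INTO item_word VALUES (1, 1, 1), (2, 1, 2), (3, 2, 3), (4, 2, 4);
    """
)

BASE_QUERY = """
    SELECT item.brand, COUNT(*) AS word_count
    FROM item
    JOIN item_word ON item.item_id = item_word.item_id
    JOIN word ON item_word.word_id = word.word_id
    WHERE word.word IN ({placeholders})
    GROUP BY item.item_id, item.brand
"""


def rank_items(words):
    """Items matching any word, ordered by how many words they match."""
    words = sorted(set(words))
    if not words:
        return []
    sql = BASE_QUERY.format(placeholders=",".join("?" for _ in words))
    sql += " ORDER BY word_count DESC, item.item_id"
    return conn.execute(sql, words).fetchall()


def items_with_all(words):
    """Brands of the items that match every one of the given words."""
    words = sorted(set(words))  # duplicates would break the count comparison
    if not words:
        return []
    sql = BASE_QUERY.format(placeholders=",".join("?" for _ in words))
    sql += " HAVING COUNT(*) = ? ORDER BY item.item_id"
    return [row[0] for row in conn.execute(sql, words + [len(words)])]


print(rank_items(["1GB RAM", "512GB RAM", "ATA"]))
print(items_with_all(["1GB RAM", "ATA"]))
```

Passing len(words) as a parameter keeps the having clause in step with the in list, which avoids the mismatch between the two numbers.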
if you just do...
item: Brand | model |additional products | price | long_word_string
then you have to do...
select *
from item
where long_word_string like '%1GB RAM%' or long_word_string like '%ATA%'
or even...
select *
from item
where long_word_string regexp '1GB RAM|ATA'
but those are very inefficient/costly methods... and it is better to just normalize things so you're not storing extra data and killing performance trying to get it out
does that make sense? does it answer your question?
edit: my answer lost out to just two tables... i'm concerned for OP's database now.
Storing your search terms in never-ending additional columns runs counter to database "normalization". Storing everything in one column is usually the last option, since it is much easier to break down search terms when they are stored separately.
Make a separate table and join your original table to this table. Your structure would look something like this:
Original table
New table
I added a primary key column to your original table. This will make the JOIN easier. Use the following statement to join the two tables:
SELECT ABB2.*
FROM original_table AS ABB2
JOIN new_table AS ABB1 ON ABB1.product_id = ABB2.id
WHERE ABB1.search_word = "your search term"
The "search_word" column in the new table holds the terms associated with each of the entries in your original table.
If you want fuzzy search (return all results that contain your search term), use LIKE instead of = in the WHERE clause and add "%" wildcards around the term.
Thanks for all the suggestions. It was very helpful. I think I will try to go for the separate table for keywords, but I'm not sure how to code this part, so I will have to start learning that too :)

Use search results (Like %search%), to match id's in another table in the database

I have two tables in the database, parts, and products.
I have a column in the products table with strings of ids (comma separated). Those ids match ids of the parts table.
**parts**
ID | description (I'm searching this part)
-------------------------------
1 | some text here
2 | some different text here
3 | etc...
**products**
ID | parts-list
--------------------------------
1 | 1,2,3
2 | 2,3
3 | 1,2
I'm really struggling with the SQL query on this one.
I've done the 1st part, got the id's from the parts table
SELECT * FROM parts WHERE description LIKE '%{$search}%'
The biggest problem is the comma separated structure of the parts-list column.
Obviously, I could do it in PHP: create an array of the results from the parts table, use that to search the products table for ids, and then use those results to grab the row data from the parts table (again). Not very efficient.
I also tried this, but I'm obviously trying to compare two arrays here, not sure how this should be done.
SELECT * FROM `products` WHERE
CONCAT(',', description, ',')
IN (SELECT `id` FROM `parts` WHERE `description` LIKE '%{$search}%')
Can anybody help?
I would perhaps try a combination of LOCATE() and SUBSTR(). I work mainly in MSSQL which has CHARINDEX() that I think works like MySQL's LOCATE(). It is bound to be messy. Are there a variable number of elements in the parts-list field?
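One way the substring-location idea could look is sketched below, with Python's sqlite3 module standing in for MySQL. SQLite's instr(haystack, needle) plays the role of MySQL's LOCATE(needle, haystack); note the reversed argument order. The column is named parts_list here because a hyphen would need quoting, and the function name products_using is an assumption. Wrapping both the list and the id in commas keeps id 1 from matching inside 11 or 21.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE parts (id INTEGER PRIMARY KEY, description TEXT);
    CREATE TABLE products (id INTEGER PRIMARY KEY, parts_list TEXT);
    INSERT INTO parts VALUES
        (1, 'some text here'),
        (2, 'some different text here'),
        (3, 'etc');
    INSERT INTO products VALUES (1, '1,2,3'), (2, '2,3'), (3, '1,2');
    """
)


def products_using(search):
    """Ids of products whose parts list contains a part matching the search."""
    rows = conn.execute(
        """
        SELECT DISTINCT pr.id
        FROM products AS pr
        JOIN parts AS pa
          ON instr(',' || pr.parts_list || ',', ',' || pa.id || ',') > 0
        WHERE pa.description LIKE ?
        ORDER BY pr.id
        """,
        (f"%{search}%",),
    )
    return [row[0] for row in rows]


print(products_using("different"))
```

This join cannot use an index on the list column, so a separate product-to-part table would still scale better.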

How to implement a Search Algorithm

This is the first time I am writing an actual search feature for my database.
The database consists of hotel names, hotel food items, hotel locations.
I would like the above three to show up during a search of a string.
Are there any common search algorithms or packages that can be used?
EXPECTED RESULT SET:
id | name | description | table_name | rank
56 | KFC| Fried chicken | hotel | 1
12 | [food item name] | [food item description] | food_item | 2
19 | [hotel name] | [hotel description] | hotel | 3
....
Do you mean a relational database? If yes, your "search" algorithm is a WHERE clause.
Do you mean contextual search? Lucene is a great search engine implementation written in Java. This might help you marry it with Lucene:
http://www.cabotsolutions.com/2009/05/using-solr-lucene-for-full-text-search-with-mysql-db/
The answer is far more complicated if you're thinking about crawling web sites based on some criteria. Please clarify.
If you are using Microsoft SQL Server, FreeText works very well:
http://msdn.microsoft.com/en-us/library/ms176078.aspx
Let's consider you're using mysql.
Well your question is basically: how to write a query that will search hotel name, food items, and hotel location.
I guess these 3 pieces of information are stored in 3 different tables. The easiest way would be to simply query the 3 tables one after the other with queries like these:
SELECT * FROM hotel WHERE hotel_name LIKE "%foobar%";
SELECT * FROM hotel_food_item WHERE item_name LIKE "%foobar%";
SELECT * FROM hotel_location WHERE hotel_name LIKE "%foobar%" OR street_name LIKE "%foobar%" OR city LIKE "%foobar%";
Make sure your search terms are safe from SQL injection.
You may (or may not) want to group the queries into 1 bigger query.
If your database is becoming large (like > 100 000 lines per table), or if you have a lot of search queries, you might be interested in creating a search index, or in using a dedicated database intended for text search, like Elasticsearch or something else.
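The first two points can be sketched together: a single parameterized query over the three tables. The example uses Python's sqlite3 module in place of PHP and MySQL; the column layout, the sample rows and the names like_pattern and search are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE hotel (id INTEGER PRIMARY KEY, hotel_name TEXT);
    CREATE TABLE hotel_food_item (id INTEGER PRIMARY KEY, item_name TEXT);
    CREATE TABLE hotel_location (
        id INTEGER PRIMARY KEY, hotel_name TEXT, street_name TEXT, city TEXT
    );
    INSERT INTO hotel VALUES (1, 'KFC'), (2, 'Grand Palace');
    INSERT INTO hotel_food_item VALUES (1, 'Fried chicken'), (2, 'Palace curry');
    INSERT INTO hotel_location VALUES (1, 'Grand Palace', 'Main Street', 'Chicago');
    """
)


def like_pattern(term):
    """Escape LIKE wildcards in user input, then wrap it in % for 'contains'."""
    escaped = term.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    return f"%{escaped}%"


def search(term):
    """Search all three tables with one query; the term is bound as a parameter."""
    rows = conn.execute(
        """
        SELECT id, hotel_name AS name, 'hotel' AS table_name
        FROM hotel WHERE hotel_name LIKE :p ESCAPE '\\'
        UNION ALL
        SELECT id, item_name, 'food_item'
        FROM hotel_food_item WHERE item_name LIKE :p ESCAPE '\\'
        UNION ALL
        SELECT id, hotel_name, 'hotel_location'
        FROM hotel_location
        WHERE hotel_name LIKE :p ESCAPE '\\'
           OR street_name LIKE :p ESCAPE '\\'
           OR city LIKE :p ESCAPE '\\'
        ORDER BY table_name, id
        """,
        {"p": like_pattern(term)},
    )
    return rows.fetchall()


print(search("palace"))
```

Binding the term as a parameter guards against SQL injection, while the escaping step only stops % and _ in the input from acting as wildcards.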
Edit:
If relevance matters, use MATCH AGAINST:
http://maisonbisson.com/blog/post/10752/making-mysql-do-relevance-ranked-full-text-searches/
http://www.devshed.com/c/a/PHP/Using-Relevance-Rankings-for-Full-Text-and-Boolean-Searches-with-MySQL/
PHP MySQL Search And Order By Relevancy
You'll have to create 3 subqueries that do MATCH AGAINST, and then combine them. You can select MATCH (...) AGAINST ("foobar") as rank, so you'll have the score you need.
This should look like:
SELECT *
FROM
(
SELECT id, 'hotel' as table_name, MATCH (search_field1) AGAINST ("lorem") as rank FROM tableA
UNION
SELECT id, 'food' as table_name, MATCH (search_field2) AGAINST ("lorem") as rank FROM tableB
) as res
ORDER BY res.rank DESC
If you are using MyISAM tables (or InnoDB on MySQL 5.6 or later, which also supports FULLTEXT indexes), you can use MySQL's built-in full text search.
this works by first putting a full-text index on the columns you wish to search, and then creating a query that looks roughly like this:
SELECT *, MATCH(column_to_search) AGAINST($search_string) AS relevance
FROM your_table
WHERE MATCH(column_to_search) AGAINST($search_string IN BOOLEAN MODE)
ORDER BY relevance DESC
LIMIT 20
