PHP MySQL Search And Order By Relevancy - php

I know how to do a regular php mysql search and display the results. However, because of the nature of what I'm trying to accomplish I need to be able to sort by relevancy. Let me explain this better:
A normal query for "apple iphone applications" will search the database using %apple iphone applications%, but if no record contains that phrase in that exact order, the search returns nothing.
What I basically need to do is search for 'apple', 'iphone' and 'applications' separately, merge the results into one, and then grade the relevancy by how many instances of the search words are found in each record. For example, if I did that and it returned the following:
Iphone Applications From Apple
Apple Make The Best Apple Iphone Applications
Iphone Applications
They would rank as follows:
Apple Make The Best Apple Iphone Applications
Iphone Applications From Apple
Iphone Applications
Because of how many instances of the search terms are found. See highlighted:
[Apple] Make The Best [Apple] [Iphone] [Applications]
[Iphone] [Applications] From [Apple]
[Iphone] [Applications]
I hope I have explained this well enough and I would be extremely grateful if anyone could give me any pointers.
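The ranking described above can be sketched outside the database. This is a minimal illustration in Python, not a MySQL solution; the function names relevance and rank_records are made up for the example, and it assumes whole-word, case-insensitive matching.

```python
import re

def relevance(text, terms):
    """Count case-insensitive whole-word occurrences of any search term."""
    wanted = {t.lower() for t in terms}
    words = re.findall(r"\w+", text.lower())
    return sum(1 for w in words if w in wanted)

def rank_records(records, query):
    """Return the records that match at least one term, most matches first."""
    terms = query.split()
    scored = [(relevance(r, terms), r) for r in records]
    matched = [pair for pair in scored if pair[0] > 0]
    # sorted() is stable, so records with equal scores keep their input order.
    return [r for score, r in sorted(matched, key=lambda pair: -pair[0])]

records = [
    "Iphone Applications From Apple",
    "Apple Make The Best Apple Iphone Applications",
    "Iphone Applications",
]
ranked = rank_records(records, "apple iphone applications")
```

With the three titles from the question, ranked comes back in the order the question asks for (4, 3 and 2 matches respectively).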

Take a look at the MySQL FULLTEXT search functions.
These should automatically return results by relevancy, and give you much more control over your searches.
The only potential issue with using FULLTEXT indexes is that InnoDB tables only support them from MySQL 5.6 onward; on older versions you need MyISAM tables.

Maybe this might help someone (order by relevance and keep word count):
Non-full-text:
SELECT *, ( (value_column LIKE '%rusten%') + (value_column LIKE '%dagen%') + (value_column LIKE '%bezoek%') + (value_column LIKE '%moeten%') ) as count_words
FROM data_table
WHERE (value_column LIKE '%dagen%' OR value_column LIKE '%rusten%' OR value_column LIKE '%bezoek%' OR value_column LIKE '%moeten%')
ORDER BY count_words DESC
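Full-text (in boolean mode the + operator requires every word to be present; omit it to match any of the words):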
SELECT * FROM data_table
WHERE MATCH(value_column) AGAINST('+dagen +rusten +bezoek +moeten' IN BOOLEAN MODE)
ORDER BY MATCH(value_column) AGAINST('+dagen +rusten +bezoek +moeten' IN BOOLEAN MODE) DESC;
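The LIKE-based variant above can be tried end to end with a small sketch. It assumes Python's built-in sqlite3 module as a stand-in for MySQL (both evaluate a LIKE comparison to 0 or 1), made-up sample rows, and a hypothetical helper named search_by_word_count. The words are bound as parameters, but % and _ inside a word are not escaped here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data_table (id INTEGER PRIMARY KEY, value_column TEXT)")
conn.executemany(
    "INSERT INTO data_table (value_column) VALUES (?)",
    [("dagen rusten",), ("bezoek",), ("niets relevants",),
     ("dagen rusten bezoek moeten",)],
)

def search_by_word_count(conn, words):
    """Return (value, count_words) rows ordered by how many words matched."""
    # One "(value_column LIKE ?)" term per word; each evaluates to 0 or 1.
    score = " + ".join("(value_column LIKE ?)" for _ in words)
    where = " OR ".join("value_column LIKE ?" for _ in words)
    sql = (
        f"SELECT value_column, ({score}) AS count_words "
        f"FROM data_table WHERE {where} "
        "ORDER BY count_words DESC, id"
    )
    patterns = [f"%{w}%" for w in words]
    # The patterns are bound twice: once for the score, once for the WHERE.
    return conn.execute(sql, patterns + patterns).fetchall()

rows = search_by_word_count(conn, ["rusten", "dagen", "bezoek", "moeten"])
```

Note that this counts how many distinct search words appear in a row, not how many times each word occurs.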

A quick google gave me this link.
Example:
select title, match (title,content) against ('internet') as score
from cont
where match (title,content) against ('internet') limit 10;

SELECT field2, field3, ..., MATCH(field1, field2) AGAINST ("search string") AS relevance
FROM table_name
WHERE MATCH(field1, field2) AGAINST ("search string")
ORDER BY relevance DESC
LIMIT 0,10
In the result set, there will be a field "relevance", which is used here to sort the results.

I don't know exactly what you want, but the following code should work for you. The left-hand side of RLIKE can be either a string literal or a column:
SELECT `column_name` RLIKE "Apple|Iphone|Application" AS Result FROM `table_name` ORDER BY Result DESC;
Separate the words with a bar (|). The result will be 1 or 0 (found or not found, respectively).
If you want to get the matching rows, see below.
SELECT * FROM `table_name` WHERE `column_name` RLIKE "Apple|Iphone|Application";

Related

Select from 3 possible columns, order by occurrences / relevance

I have a table that contains 3 text fields, and an ID one.
The table exists solely to get a collection of IDs of posts based on the relevance of a user search.
Problem is I lack the Einsteinian intellect necessary to warp the SQL continuum to get the desired results -
SELECT `id` FROM `wp_ss_images` WHERE `keywords` LIKE '%cute%' OR `title` LIKE '%cute%' OR `content` LIKE '%cute%'
Is this really enough to get a relevant-to-least-relevant list, or is there a better way?
Bearing in mind that the table could hold up to 20k rows, I want to keep it efficient.
Here is an update - I've gone the fulltext route -
EXAMPLE:
SELECT `id` FROM `wp_ss_images` WHERE MATCH (`keywords`,`title`,`content`) AGAINST ('+cute +dog' IN BOOLEAN MODE);
However, it seems to be just grabbing all entries with any of the words. How can I refine this to show relevance by occurrences?
To get a list of results based on the relevance of the number of occurrences of keywords in each field (meaning cute appears in all three fields first, then in 2 of the fields, etc.), you could do something like this:
SELECT id
FROM (
SELECT id,
(keywords LIKE '%cute%') + (title LIKE '%cute%') + (content LIKE '%cute%') total
FROM wp_ss_images
) t
WHERE total > 0
ORDER BY total DESC
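You could concatenate the fields and search them in one pass instead of searching each one individually (CONCAT_WS with a separator avoids a NULL result when one field is NULL, and avoids matches that straddle two fields):
SELECT `id` FROM `wp_ss_images` WHERE CONCAT_WS(' ', `keywords`, `title`, `content`) LIKE '%cute%'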
This doesn't help with the 'greatest to least' part of your question though.

How to implement a Search Algorithm

This is the first time I am writing an actual search feature for my database.
The database consists of hotel names, hotel food items, hotel locations.
I would like the above three to show up during a search of a string.
Are there any common search algorithms or packages that can be used?
EXPECTED RESULT SET:
id | name | description | table_name | rank
56 | KFC | Fried chicken | hotel | 1
12 | [food item name] | [food item description] | food_item | 2
19 | [hotel name] | [hotel description] | hotel | 3
....
Do you mean a relational database? If yes, your "search" algorithm is a WHERE clause.
Do you mean contextual search? Lucene is a great search engine implementation written in Java. This might help you marry it with Lucene:
http://www.cabotsolutions.com/2009/05/using-solr-lucene-for-full-text-search-with-mysql-db/
The answer is far more complicated if you're thinking about crawling web sites based on some criteria. Please clarify.
If you are using Microsoft SQL Server, FreeText works very well:
http://msdn.microsoft.com/en-us/library/ms176078.aspx
Let's assume you're using MySQL.
Your question is basically: how do I write a query that will search hotel names, food items, and hotel locations?
I guess these three pieces of information are stored in three different tables. The easiest way would be to simply query the three tables one after the other with queries like these:
SELECT * FROM hotel WHERE hotel_name LIKE "%foobar%";
SELECT * FROM hotel_food_item WHERE item_name LIKE "%foobar%";
SELECT * FROM hotel_location WHERE hotel_name LIKE "%foobar%" OR street_name LIKE "%foobar%" OR city LIKE "%foobar%";
Make sure your search terms are safe from SQL injection.
You may (or may not) want to combine the queries into one bigger query.
If your database is becoming large (say, more than 100,000 rows per table), or if you run a lot of search queries, you might want to create a search index or use a dedicated text-search engine such as Elasticsearch.
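The first two points can be illustrated together. This is a sketch under stated assumptions: Python's sqlite3 module stands in for MySQL, the sample rows are invented, and the helper name search_all and the output column names are hypothetical. The search term is bound as a parameter and never spliced into the SQL text, and the per-table queries are merged with UNION ALL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE hotel (id INTEGER PRIMARY KEY, hotel_name TEXT);
    CREATE TABLE hotel_food_item (id INTEGER PRIMARY KEY, item_name TEXT);
    INSERT INTO hotel (hotel_name) VALUES ('Foobar Inn'), ('Grand Plaza');
    INSERT INTO hotel_food_item (item_name)
        VALUES ('Foobar burger'), ('Fried chicken');
""")

def search_all(conn, term):
    """Search both tables with one query; the term is passed as a bound value."""
    pattern = f"%{term}%"
    sql = """
        SELECT id, hotel_name AS name, 'hotel' AS table_name
        FROM hotel WHERE hotel_name LIKE ?
        UNION ALL
        SELECT id, item_name AS name, 'food_item' AS table_name
        FROM hotel_food_item WHERE item_name LIKE ?
    """
    return conn.execute(sql, (pattern, pattern)).fetchall()

results = search_all(conn, "foobar")
# A classic injection attempt is treated as plain text and matches nothing.
hostile = search_all(conn, "' OR '1'='1")
```

In PHP the same idea applies through prepared statements in PDO or mysqli.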
Edit:
If relevance matters, use MATCH AGAINST:
http://maisonbisson.com/blog/post/10752/making-mysql-do-relevance-ranked-full-text-searches/
http://www.devshed.com/c/a/PHP/Using-Relevance-Rankings-for-Full-Text-and-Boolean-Searches-with-MySQL/
PHP MySQL Search And Order By Relevancy
You'll have to create 3 subqueries that do MATCH AGAINST, and then combine them. You can do AGAINST("foobar") as rank so you'll have the score you need.
This should look like:
SELECT *
FROM
(
-- rank is a reserved word in MySQL 8.0 and later, hence the backticks
SELECT id, 'hotel' as table_name, MATCH (search_field1) AGAINST ("lorem") as `rank` FROM tableA
UNION
SELECT id, 'food' as table_name, MATCH (search_field2) AGAINST ("lorem") as `rank` FROM tableB
) as res
ORDER BY res.`rank` DESC
If you are using MyISAM tables (or InnoDB on MySQL 5.6 or later), you can use MySQL's built-in full-text search.
This works by first putting a full-text index on the columns you wish to search, and then creating a query that looks roughly like this (bind the search string to the ? placeholders instead of interpolating $search_string into the SQL):
SELECT *, MATCH(column_to_search) AGAINST(? IN BOOLEAN MODE) AS relevance
FROM your_table
WHERE MATCH(column_to_search) AGAINST(? IN BOOLEAN MODE)
ORDER BY relevance DESC
LIMIT 20

Optimizing auto-complete FULLTEXT SQL query

I have the following query which is used in order to do an auto-complete of a search box:
SELECT *, MATCH (screen_name, name) AGAINST ('+query*' IN BOOLEAN MODE) AS SCORE
FROM users
WHERE MATCH (screen_name, name) AGAINST ('+query*' IN BOOLEAN MODE)
ORDER BY SCORE DESC LIMIT 3
I also have a FULLTEXT index on screen_name and name (together). When this table was relatively small (50k rows) this worked great. Now the table is ~200k rows and each query takes seconds(!) to complete. I'm using MySQL with MyISAM. Is this reasonable? What directions should I explore to improve this? It surely doesn't satisfy the needs of an auto-complete query.
MySQL's MATCH AGAINST is really slow for this kind of query. You should look into alternatives like Sphinx Search Server.

I'm not getting the expected result from an SQL query

I'm developing a search function for a website. I have a table called keywords with two fields id and keyword. I have two separate search queries for AND and OR. The problem is with the AND query. It is not returning the result that I expect.
The printed SQL is :
SELECT COUNT(DISTINCT tg_id)
FROM tg_keywords
WHERE tg_keyword='keyword_1'
AND tg_keyword='keyword_2'
The count returned is 0, while if I perform the same SQL with OR instead of AND the count returned is 1. I expected the count to be 1 in both cases, and I need it to be this way as the AND results will take priority over the OR results.
Any advice will be much appreciated.
Thanks
Archie
It will always return 0, unless keyword_1=keyword_2. tg_keyword can only have one value, and when you say AND, you're asking for both conditions to be true.
It's the same, logically speaking, as asking "How many friends do I have whose name is 'JACK' and 'JILL'"? None, nobody is called both JACK and JILL.
I don't know what your table looks like and how things are related to each other, but this query makes no sense. You're returning rows where the keyword is one thing and another thing at the same time? That's impossible.
You probably have another table that links to the keywords? You should search with that, using a join, and search for both keywords. We could give you a more precise answer if you could tell us what your tables look like.
EDIT: Based on what you wrote in a comment below (please edit your question!!), you're probably looking for this:
SELECT COUNT(DISTINCT kw1.tg_id)
FROM tg_keywords AS kw1, tg_keywords AS kw2
WHERE kw1.tg_id = kw2.tg_id
AND kw1.tg_keyword='keyword_1'
AND kw2.tg_keyword='keyword_2'
Your query can't work because it has a condition that is always false, so no record will ever be selected:
tg_keyword='keyword_1' AND tg_keyword='keyword_2'
What are you trying to do? Could you post the columns of this table?
tg_keyword='keyword_1' AND tg_keyword='keyword_2'
Logically this cannot be true, ever. It cannot be both. Did you mean something like:
SELECT * FROM tg_keywords
WHERE tg_keyword LIKE '%keyword_1%' OR tg_keyword LIKE '%keyword_2%'
ORDER BY (tg_keyword LIKE '%keyword_1%') + (tg_keyword LIKE '%keyword_2%') DESC;
Based on the OP's clarification:
I have a table with multiple keywords with the same id. How can I get more than one keyword compared for the same id, as the search results need to be based on how many keywords from a search array match keywords in the keywords table from each unique id. Any ideas?
I assume you're looking to return search results based on a ranking of how many of the selected keywords are a match with those results? In other words, is the ID field that multiple keywords share the ID of a potential search result?
If so, assuming you pass in an array of keywords of the form {k1, k2, k3, k4}, you might use a query like this:
SELECT ID, COUNT(ID) AS ResultRank FROM tg_keywords WHERE tg_keyword IN (k1, k2, k3, k4) GROUP BY ID ORDER BY ResultRank DESC
This example also assumes a given keyword might appear in the tables multiple times with different IDs (because a keyword might apply to multiple search results). The query will return a list of IDs in descending order based on the number of times they appear with any of the selected keywords. In the given example, the highest rank for a given ID should be 4, meaning ALL keywords apply to the result with that ID...
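The ranking query above can be tried on toy data. A minimal sketch, assuming Python's sqlite3 module in place of MySQL, the tg_id and tg_keyword column names from the question, invented sample rows, and a hypothetical helper called rank_ids. It counts distinct keywords so that a duplicated row cannot inflate the rank.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tg_keywords (tg_id INTEGER, tg_keyword TEXT)")
conn.executemany(
    "INSERT INTO tg_keywords VALUES (?, ?)",
    [(1, "red"), (1, "car"), (1, "fast"),
     (2, "red"), (2, "bike"),
     (3, "boat")],
)

def rank_ids(conn, keywords):
    """Return (tg_id, matches) pairs, best match first."""
    # One "?" per keyword; assumes the keyword list is not empty.
    placeholders = ", ".join("?" for _ in keywords)
    sql = (
        "SELECT tg_id, COUNT(DISTINCT tg_keyword) AS result_rank "
        f"FROM tg_keywords WHERE tg_keyword IN ({placeholders}) "
        "GROUP BY tg_id ORDER BY result_rank DESC, tg_id"
    )
    return conn.execute(sql, list(keywords)).fetchall()

ranking = rank_ids(conn, ["red", "car", "fast"])
```

Here id 1 matches all three keywords and id 2 matches one, so id 1 ranks first and id 3 is absent from the result.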
I think you will need to join tg_keywords to itself. Try playing around with something like
select *
from tg_keywords k1
join tg_keywords k2 on k1.tg_id = k2.tg_id
where k1.tg_keyword = 'keyword_1' and k2.tg_keyword = 'keyword_2'
Try:
SELECT tg_id
FROM tg_keywords
WHERE tg_keyword in ('keyword_1','keyword_2')
GROUP BY tg_id
HAVING COUNT(DISTINCT tg_keyword) = 2
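The GROUP BY ... HAVING form generalises to any number of keywords if the required count is passed in alongside the list. A minimal sketch, again assuming Python's sqlite3 module as a stand-in for MySQL, invented sample rows, and a hypothetical helper named ids_matching_all:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tg_keywords (tg_id INTEGER, tg_keyword TEXT)")
conn.executemany(
    "INSERT INTO tg_keywords VALUES (?, ?)",
    [(1, "red"), (1, "car"), (2, "red"), (2, "bike"), (3, "boat")],
)

def ids_matching_all(conn, keywords):
    """Return the tg_id values that carry every one of the given keywords."""
    # De-duplicate so the HAVING count matches the number of distinct keywords.
    distinct = list(dict.fromkeys(keywords))
    placeholders = ", ".join("?" for _ in distinct)
    sql = (
        f"SELECT tg_id FROM tg_keywords WHERE tg_keyword IN ({placeholders}) "
        "GROUP BY tg_id HAVING COUNT(DISTINCT tg_keyword) = ? ORDER BY tg_id"
    )
    return [row[0] for row in conn.execute(sql, distinct + [len(distinct)])]

both = ids_matching_all(conn, ["red", "car"])
either_only = ids_matching_all(conn, ["red", "boat"])
```

Only id 1 has both "red" and "car", while no id has both "red" and "boat", which is the AND behaviour the question was after.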

Complicated SQL query - how do i do it?

I'm currently working on a small feature for a project and wondered how best to achieve the result. I have a table full of reviews, each with a score out of 5 stored as score in the database. On one page I want to show how many reviews there are for each score, i.e. the number of 5-star reviews, 4-star reviews, etc.
But I don't know how best to achieve this. Surely I don't need 5 different queries? That would be awful design, would it not?
Thanks, and I hope you can help!
Since I do not have your table structure, I would do something similar to this (with appropriate names replaced)
edited SQL based on comments
Select Score, COUNT(*) as NumScores
From MyTableOfScores
Group By Score
Order by Score Desc
You need something like this:
select score, count(*) from reviews group by score
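One detail worth knowing: GROUP BY only returns scores that actually occur, so a score with no reviews is simply missing from the result. A small sketch of filling those gaps in application code, assuming Python's sqlite3 module in place of the real database, made-up sample scores, and a hypothetical helper named score_histogram:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE reviews (id INTEGER PRIMARY KEY, score INTEGER)")
conn.executemany(
    "INSERT INTO reviews (score) VALUES (?)",
    [(5,), (5,), (4,), (3,), (5,), (1,), (4,)],
)

def score_histogram(conn):
    """Return {score: number_of_reviews} for scores 5 down to 1."""
    # A single query yields one (score, count) row per score that occurs.
    counts = dict(conn.execute(
        "SELECT score, COUNT(*) FROM reviews GROUP BY score"
    ))
    # Scores with no reviews are absent from the query result; default to 0.
    return {score: counts.get(score, 0) for score in range(5, 0, -1)}

histogram = score_histogram(conn)
```

With the sample data there are no 2-star reviews, so the query returns four rows and the helper adds the missing zero.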
