I am implementing a search feature for my project. I am using a FULLTEXT search query to return accurate results to the user. I am a beginner in PHP programming and I do not know much about FULLTEXT search.
This is my query:
$sql = $conn->prepare("SELECT *, MATCH(title, keyword) AGAINST(? IN BOOLEAN MODE) AS relevance FROM table ORDER BY relevance DESC LIMIT 20");
$sql->bind_param("s", $q);
$sql->execute();
$rs = $sql->get_result();
This query works, but it shows old results first instead of the most relevant ones. Also, it does not work correctly when the search consists of a single word (e.g. keyword = Google).
Please do not suggest Elasticsearch, Sphinx, Algolia, etc.
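In natural-language mode, when MATCH() is used in a WHERE clause and the query has no explicit ORDER BY, the rows returned are automatically sorted with the highest relevance first. BOOLEAN MODE searches are not sorted this way automatically.
So put the MATCH() condition in the WHERE clause so that only matching rows are returned. If you stay in BOOLEAN MODE, keep the relevance column and the ORDER BY relevance DESC.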
Source: https://dev.mysql.com/doc/refman/8.0/en/fulltext-natural-language.html
Why are you not using the SQL LIKE operator? Here is an example that searches for multiple words in the columns title and keywords of a table named products:
$db=mysqli_connect('localhost','root','','project');
$search=$_GET['userinput'];
$searcharray = explode(' ', $search);
$searchpdo=array();
$finalstate="";
foreach ( $searcharray as $ind=> $query){
$sql=array();
$exp='%'.$query.'%';
array_push($sql,"(title LIKE ?)");
array_push($searchpdo,$exp);
array_push($sql,"(keywords LIKE ?)");
array_push($searchpdo,$exp);
if($finalstate==""){
$finalstate = "(".implode(" OR ",$sql).")";
}
else{
$finalstate = $finalstate." AND "."(".implode(" OR ",$sql).")";
}
}
$stmt = $db->prepare("SELECT * FROM products WHERE (".$finalstate.") ");
$types=str_repeat('s',count($searchpdo));
$stmt->bind_param($types,...$searchpdo);
$stmt->execute();
$result = $stmt->get_result();
This will give you the correct result for a single word or for multiple words.
I think you have to tweak your query a little bit to get the desired results, as follows:
// mysql_query() was removed in PHP 7; this assumes a mysqli connection in $conn.
$result = $conn->query("SELECT * FROM
patient_db WHERE
MATCH ( Name, id_number )
AGAINST ('+firstWord +SecondWord +ThirdWord' IN BOOLEAN MODE);");
And if you want to search for an exact phrase (the inner double quotes must be escaped inside the PHP string):
$result = $conn->query("SELECT *
FROM patient_db
WHERE MATCH ( Name, id_number )
AGAINST ('\"Exact phrase/Words\"' IN BOOLEAN MODE);");
I posted the same answer in another SO post, but I can't find that post now.
There are multiple aspects to your question.
If available, use the mysql client to run the query instead of PHP first, until the query returns what you want.
If you want recent documents (records) to show up at the top of the search results, you need to change your ORDER BY clause. Currently, it is supposed to return the closest match first (i.e. by relevance).
You need to strike a balance between relevance and recency (it is not clear how you define this) in your custom logic. A simple example that prioritizes the last week over the last month, and the last month over the rest:
SELECT
....
, DATEDIFF(CURDATE(), ItemDate) AS ItemAgeInDays
ORDER BY
relevance
* 100
* CASE
WHEN ItemAgeInDays BETWEEN 0 AND 7 -- last week
THEN 20
WHEN ItemAgeInDays BETWEEN 0 AND 30 -- last month
THEN 10
ELSE 1
END
DESC
You say a single-word search does not work. In BOOLEAN MODE, you build Boolean logic for your search, and it uses special characters for that. For example, +apple means 'apple' must exist. It is possible that your single word conflicts with these characters.
Please review this reference, it explains the BOOLEAN MODE in great detail.
https://dev.mysql.com/doc/refman/8.0/en/fulltext-boolean.html
You say the query is not returning the correct result. FULLTEXT search looks for your search term in each document (row) and counts how many times it appears in each document. It then offsets that by how many times your search term appears in ALL documents. This means it prioritizes records where your search term appears much more often than average. If your search term is not distinctive enough, the ranking might not seem correct, because most documents look alike with respect to that term. See the above link for more details.
BOOLEAN MODE does not sort the result by relevance by default. You need to add ORDER BY yourself, which you already did. I just wanted to note it here for others.
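The point above about Boolean-mode special characters can be sketched in PHP. This is a minimal sketch, not a definitive implementation: toBooleanQuery() is a hypothetical helper name, and the choice to require every word (+) and allow prefix matches (*) is an assumption about the behaviour you want. Note that it also splits hyphenated words, because - is a Boolean operator.

```php
<?php
// Hypothetical helper (not from the original answers): turn raw user input
// into a BOOLEAN MODE search string in which every word is required.
function toBooleanQuery(string $input): string
{
    // Replace Boolean-mode operators with spaces so that user input
    // cannot change the logic of the search.
    $clean = preg_replace('/[+\-><()~*"@]+/', ' ', $input);

    // Split on whitespace and drop empty pieces.
    $words = preg_split('/\s+/', trim($clean), -1, PREG_SPLIT_NO_EMPTY);

    // "+" means the word must be present, "*" allows prefix matches.
    $terms = array_map(fn(string $w): string => '+' . $w . '*', $words);

    return implode(' ', $terms);
}

echo toBooleanQuery('large red bird'), PHP_EOL; // +large* +red* +bird*
echo toBooleanQuery('Google'), PHP_EOL;         // +Google*
```

The returned string would then be bound to the ? placeholder of the AGAINST(? IN BOOLEAN MODE) clause from the question.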
Introduction
I have a project where we help the IT community to organize better IT meetups (just like meetup.com but only for IT community) called https://codotto.com/
We are developing a new feature where we would like to show the number of usages of certain tags. The algorithm should work just like Stackoverflow's one. You write javascript and you get a list with the tags that match your query sorted by the most used ones.
I'm currently using Laravel but I will post the raw query so that it's easier for mysql wizards to help me out if possible :)
Problem
I have the following tables
tags table
id name
1 javascript
2 javascript-tools
3 javascript-security
group_has_tags table
group_id tag_id
1 2
2 2
2 3
The tag javascript-tools is used twice, javascript-security is used once, and javascript is not used at all.
Now, if a user searches for javascript, he should get javascript first (because it is a direct match), followed by the rest of the tags sorted by their usage.
In Laravel this is the code that I have (simplified ofc)
$tags = Tag::withCount('groups')
->where('name', 'LIKE', 'javascript%')
->orderBy('groups_count', 'DESC')
->limit(2)
->get();
The problem is that, since I only return 2 results, javascript is not included in the results, even though it's literally what the user wrote.
For the mysql magicians, here is the raw query
select
`tags`.*,
(
select
count(*)
from
`meetups`
inner join `meetup_has_tags` on `meetups`.`id` = `meetup_has_tags`.`meetup_id`
where
`tags`.`id` = `meetup_has_tags`.`tag_id`
) as `meetups_count`
from
`tags`
where
`title` LIKE 'javascript%'
order by
`meetups_count` desc
limit
2 offset 0
Question
The main objective here is to return the most relevant result to the user. He writes javascript and javascript shows up first followed by less "relevant" results. The way I found was to sort by the number of times a tag was used.
Is there a solution where I can do "please, fetch the results that match this query first, then return the most relevant results"?
By "most relevant" results, I simply mean "what the user is looking for". If he writes "javascript" it should return "javascript" followed by "javascript-tools" (because "javascript-tools" was used twice but the user is literally searching for "javascript")
Here is your query:
SELECT * FROM
(SELECT tags.*,COALESCE((SELECT COUNT(*) FROM group_has_tags WHERE tag_id = tags.id),0) AS usage_count
FROM tags
WHERE name LIKE 'javascript%') AS tmp
ORDER BY tmp.name = 'javascript' DESC,usage_count DESC
For each matching tag you get the number of times it has been used.
Then you first sort by whether the tag matches literally what the user has typed, then by the usage.
Of course you will have to parameterize this query but I hope you get the idea.
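A parameterized version could be sketched like this. It is only a sketch under stated assumptions: buildTagSearch() is a hypothetical helper, the table and column names follow the tags and group_has_tags tables from the question, and the LIMIT 2 mirrors the question's limit.

```php
<?php
// Hypothetical helper: returns the SQL with placeholders plus the values to bind.
function buildTagSearch(string $term): array
{
    $sql = 'SELECT t.*, '
         . '(SELECT COUNT(*) FROM group_has_tags g WHERE g.tag_id = t.id) AS usage_count '
         . 'FROM tags t '
         . 'WHERE t.name LIKE ? '
         . 'ORDER BY (t.name = ?) DESC, usage_count DESC '
         . 'LIMIT 2';

    // Escape LIKE wildcards in the user input, then add the trailing %.
    $prefix = addcslashes($term, '%_\\') . '%';

    // First value: prefix pattern for LIKE. Second value: exact-match test.
    return [$sql, [$prefix, $term]];
}

[$sql, $params] = buildTagSearch('javascript');
echo $params[0], PHP_EOL; // javascript%
echo $params[1], PHP_EOL; // javascript
```

With PDO, for example, the result would be used as $stmt = $pdo->prepare($sql); $stmt->execute($params); where $pdo is assumed to be an open connection.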
I am trying to do a search on my MySQL database to get the row that contains the most similar value to the one searched for.
Even if the closest result is very different, I'd still like to return it (Later on I do a string comparison and add the 'unknown' into the learning pool)
I would like to search my table 'responses' via the 'msg1' column and get one result, the one with the lowest levenshtein score, as in the one that is the most similar out of the whole column.
This sort of thing:
SELECT * FROM people WHERE levenshtein('$message', 'msg1') ORDER BY ??? LIMIT 1
I don't quite grasp the concept of levenshtein here, as you can see I am searching the whole table, sorting it by ??? (the function's score?) and then limiting it to one result.
I'd then like to set $reply to the value in column "reply" from this singular row that I get.
Help would be greatly appreciated, I can't find many examples of what I'm looking for. I may be doing this completely wrong, I'm not sure.
Thank you!
You would do:
SELECT p.*
FROM people p
ORDER BY levenshtein('$message', msg1) ASC
LIMIT 1;
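If you want a threshold (to limit the number of rows that are sorted), use a WHERE clause. Otherwise, you just need ORDER BY. Note that levenshtein() is not a built-in MySQL function, so this assumes you have defined it as a stored function.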
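If the table is small enough to load into PHP, the same "lowest score wins" idea can be sketched with PHP's built-in levenshtein() function instead of SQL. This is a different technique from the query above, shown only to illustrate the ordering; closestMessage() is a hypothetical helper and the candidate strings are made-up sample data.

```php
<?php
// Hypothetical helper: return the candidate with the smallest edit distance
// to $message, or null when there are no candidates.
function closestMessage(string $message, array $candidates): ?string
{
    $best = null;
    $bestDistance = PHP_INT_MAX;

    foreach ($candidates as $candidate) {
        // Number of single-character edits needed to turn one string into the other.
        $distance = levenshtein($message, $candidate);
        if ($distance < $bestDistance) {
            $bestDistance = $distance;
            $best = $candidate;
        }
    }

    return $best;
}

// Sample values standing in for the msg1 column.
echo closestMessage('helo there', ['hello there', 'goodbye', 'how are you']), PHP_EOL;
// hello there
```

Like ORDER BY ... LIMIT 1, this always returns a result, even when the closest candidate is very different.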
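Try this (msg1 must not be quoted, so that it refers to the column; a threshold of 0 returns only exact matches, so raise it to allow near matches):
SELECT * FROM people WHERE levenshtein('$message', msg1) <= 0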
Since I'm moving to the Sphinx search engine to improve my website's performance, I'm trying to translate my old MySQL queries to the Sphinx language.
The point is to sort results based on a math operation between votes to my posts and the points given for each vote (going from 1 to 5).
So for example, if i got 3 votes for a post and I got vote 1=5points vote 2=3points and vote 3=2points, my table will contain a field named votes with an integer = 3 (votes=3) and a field with an integer of 5+3+2 (points=10).
So the final rating for such a post is points/votes; in this example it is 10/3 = 3.333...
Assuming I'm using the sphinx api to get a list of top rated posts in DESCENDING order, this is the old mysql query i had on my php script:
mysql_query("SELECT * FROM table ORDER BY points/votes DESC LIMIT $start,$stop");
I tried to build a Sphinx query, but it is not working and always gives 0 results. Please read all the // commented lines, which show everything I tried.
require("sphinxapi.php");
$cl = new SphinxClient;
$index = 'index';
$cl->setServer("localhost", 9312);
$cl->SetMatchMode(SPH_MATCH_FULLSCAN);
//$cl->SetSortMode(SPH_SORT_EXTENDED, 'IDIV(points,votes) DESC'); //not working
//$cl->SetSortMode(SPH_SORT_EXTENDED, '(points DIV votes) DESC'); //not working
//$cl->SetSortMode(SPH_SORT_EXTENDED, 'points/votes DESC'); //not working
//$cl->SetSortMode(SPH_SORT_EXTENDED, '(points/votes) DESC'); //not working
$cl->setLimits($start,$stop,$max_matches=1000);
$query = "";
Would you please help me out finding what's wrong... thanks.
You will need to use SPH_SORT_EXPR
$cl->SetSortMode(SPH_SORT_EXPR, '(points/votes)');
Firstly, you need points and votes to be attributes, NOT fields. Attributes are stored in the index and can be used for sorting, etc. Arithmetic can only be performed on numeric attributes (not strings).
The correct syntax for SPH_SORT_EXPR (assuming you've already got the attributes) would be
$cl->SetSortMode(SPH_SORT_EXPR, 'points/votes');
SPH_SORT_EXPR is ALWAYS descending, so you don't need DESC on the end.
But rather than have Sphinx calculate that ratio every single time, you would probably be better off calculating it during sql_query and storing it as a single numeric attribute. TIP: store it as an integer, not a float. Integers are more efficient to sort by.
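The "store it as an integer" tip could be sketched as follows. This is only an illustration: ratingAsInt() is a hypothetical helper, and the scale factor of 1000 (three decimal places) is an assumption, not something Sphinx requires.

```php
<?php
// Hypothetical helper: scale points/votes to an integer so the value can be
// stored as a single numeric attribute at indexing time.
function ratingAsInt(int $points, int $votes, int $scale = 1000): int
{
    if ($votes === 0) {
        return 0; // no votes yet: avoid division by zero
    }

    // Integer division keeps the result an int (10/3 becomes 3333 at scale 1000).
    return intdiv($points * $scale, $votes);
}

echo ratingAsInt(10, 3), PHP_EOL; // 3333
echo ratingAsInt(5, 1), PHP_EOL;  // 5000
```

The same calculation can be done directly in the SQL of sql_query; the PHP version just makes the rounding and the zero-votes case explicit.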
I have an ajax script that searches database tables for expressions similar to google search. The SELECT statement just uses LIKE and finds matches in the relevant fields. It worked fine at first but as content has grown, it is giving way too many matches for most search strings.
For example, if you search for att, you get att but also attention, attaboy, buratta etc.
Good search engines such as Google seem to have an intermediate table of suggestions that have been vetted by others. Rather than search the data directly, they seem to search the approved phrases, such as AT&T, and succeed in narrowing the number of results. Has anyone coded something like this, and can you suggest the right database schema and query to get relevant results?
Right now I am searching table of say names directly with something like
$sql = "SELECT lastname from people WHERE lastname LIKE '%$searchstring%'";
I imagine besides people I should create some intermediate table along the lines of
people
id|firstname|lastname|description
niceterms
id|niceterm|peopleid
Then the query could be:
$sql = "SELECT p.lastname, p.id, n.niceterm, n.peopleid
FROM `people` p
LEFT JOIN `niceterms` n
on p.id = n.peopleid
WHERE n.niceterm LIKE '%$searchterm%'";
..so when you type something in the search box, you get nice search terms that will yield better results.
But how do I populate the niceterms table? Is this the right approach? I'm not trying to create a whole backweb or pagerank. I just want to narrow search results so they are relevant.
Thanks for any suggestions.
You might want to take a look at FULLTEXT search in MySQL. It allows you to create powerful queries based on relevance. You can, for example, create a BOOLEAN search that adds a score column to your result. The score will be based on rules such as: does the text start with a given character combination (yes: +2; no, but it does contain the combination: +1).
The below code is just another column and it has 3 rules in it:
Does the p1.name field contain Bl or rock? if yes -> add score
Does the p1.name field start with either Bl or rock? if yes -> add score
IS the p1.name equal to Bl rock? if yes -> add score
MATCH(p1.name) AGAINST('>Bl* >rock* >((+Bl*) (+rock*)) >("Bl rock")' IN BOOLEAN MODE) AS `match`
Now just order by match and it will show you the most relevant searches. You can also combine the order by with multiple statements and add a limit like below:
Orders by most recent date, highest match and then orders the matches that have the same score by their character length
ORDER BY `date` DESC, `match` DESC, LENGTH(`p1`.`name`) ASC
Keep in mind that the above code only approximates a relevant result based on common cases. Copying Google will be impossible, since their algorithms for optimal results and speed are incredible.
If FULLTEXT search is a step too far, try building a tag system. Tagging content with unique tag combinations will also give more reliable search results.
I have a simple search feature that will search a table for a hit. Basically the one flaw of this search is that it tries to match exactly the string. I want it to take a string of say "large red bird" and search "large", "red" and "bird" separately against the table. here's my search query...
$result = mysql_query("SELECT * FROM files
WHERE (tags LIKE '%$search_ar%' OR
name LIKE '%$search_ar%' OR
company LIKE '%$search_ar%' OR
brand LIKE '%$search_ar%')
$str_thera $str_global $str_branded $str_medium $str_files");
Any ideas? Thanks.
Edit
OK, here's my updated query, but it doesn't return anything.
$result = mysql_query("SELECT * FROM files
WHERE MATCH(tags, name, company, brand)
AGAINST ('$seach_ar' IN boolean MODE)");
A fulltext search will indeed help, provided you have a lot of data to search. A good reference for full-text search can be found here: http://forge.mysql.com/w/images/c/c5/Fulltext.pdf
The reason I say a lot of data, or a fair amount, is that if a search term matches more than a certain percentage of the total rows (50% for natural-language searches on MyISAM tables), nothing is returned.
If you want to continue using the LIKE method, you can. You just have to separate the words (explode) and then join them in the SQL query using AND:
...(tags LIKE '%$search_ar[0]%' AND tags LIKE '%$search_ar[1]%') OR ....
Something like that. This method can get overly complicated, especially if, say, you want to return rows that match any of the words rather than all of them. So yes, it will take some customization to build and automate, but it is possible.
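The explode-and-join approach described above could be automated roughly like this. It is a sketch, not a definitive implementation: buildLikeSearch() is a hypothetical helper, it requires every word to match in at least one column, and the column names are assumed to be trusted, hard-coded values rather than user input.

```php
<?php
// Hypothetical helper: build a WHERE fragment with placeholders in which
// every word must appear in at least one of the given columns.
function buildLikeSearch(string $search, array $columns): array
{
    $words = preg_split('/\s+/', trim($search), -1, PREG_SPLIT_NO_EMPTY);
    $groups = [];
    $params = [];

    foreach ($words as $word) {
        // Escape LIKE wildcards in the word, then wrap it in %...%.
        $pattern = '%' . addcslashes($word, '%_\\') . '%';

        $perColumn = [];
        foreach ($columns as $column) {
            $perColumn[] = "$column LIKE ?";
            $params[] = $pattern;
        }

        // Any column may contain the word.
        $groups[] = '(' . implode(' OR ', $perColumn) . ')';
    }

    // Every word must be found somewhere.
    return [implode(' AND ', $groups), $params];
}

[$where, $params] = buildLikeSearch('large red bird', ['tags', 'name']);
echo $where, PHP_EOL;
// (tags LIKE ? OR name LIKE ?) AND (tags LIKE ? OR name LIKE ?) AND (tags LIKE ? OR name LIKE ?)
```

Swapping ' AND ' for ' OR ' in the final implode() would return rows that match any of the words instead of all of them. An empty search produces an empty fragment, which the caller has to handle before appending it to a query.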
If, for whatever reason, you can't use full text searching (nice idea, Nick and Brad, but before MySQL 5.6 only MyISAM supported it, and from what I hear, it's not really all that good), it's not too hard to rig up basic searching with explode() and implode():
$search_arr = explode(' ',$search_str);
$search_str = implode('%',$search_arr);
//query where fields like $search_arr
Or:
$search_arr = explode(' ', $search_str);
$clauses = array();
foreach ($search_arr as $term) {
    // fill in the rest of the fields; escape $term before using it in SQL
    $clauses[] = "tags like '%{$term}%'";
}
$sql = 'select * from files where ' . implode(' or ', $clauses);
As a side note: If you plan on growing your searching and can't use MySQL's full text search, I recommend checking out one of the search servers, such as Solr or Lucene.