Php/ MySql 'Advanced Search' Page - php

I'm working on an 'advanced search' page on a site where you would enter a keyword such as 'I like apples' and it can search the database using the following options:
Find : With all the words, With the
exact phrase , With at least one of
the words, Without the words
I can take care of the 'Exact phrase' by:
SELECT * FROM myTable WHERE field='$keyword';
'At least one of the words' by:
SELECT * FROM myTable WHERE field LIKE '%$keyword%';//Let me know if this is the wrong approach
But its the 'With at least one of the words' and 'Without the words' that I'm stuck on.
Any suggestions on how to implement these two?
Edit: Regarding 'At least one word' it wouldn't be a good approach to use explode() to break the keywords into words, and run a loop to add
(field='$keywords') OR ($field='$keywords) (OR)....
Because there are some other AND/OR clauses in the query also and I'm not aware of the maximum number of clauses there can be.

I would suggest the use of MySQL FullText Search using this with the Boolean Full-Text Searches functionality you should be able to get your desired result.
Edit:
Requested example based on your requested conditions ("Its just one field and they can pick either of the 4 options (i.e 1 word, exact words, at least 1 word, without the term).")
I am assuming you are using php based on your initial post
<?php
$choice = $_POST['choice'];
$query = $_POST['query'];
if ($choice == "oneWord") {
//Not 100% sure what you mean by one word but this is the simplest form
//This assumes $query = a single word
$result = mysql_query("SELECT * FROM table WHERE MATCH (field) AGAINST ('{$query}' IN BOOLEAN MODE)");
} elseif ($choice == "exactWords") {
$result = mysql_query("SELECT * FROM table WHERE MATCH (field) AGAINST ('\"{$query}\"' IN BOOLEAN MODE)");
} elseif ($choice == "atLeastOneWord") {
//The default with no operators if given multiple words will return rows that contains at least one of the words
$result = mysql_query("SELECT * FROM table WHERE MATCH (field) AGAINST ('{$query}' IN BOOLEAN MODE)");
} elseif ($choice == "withoutTheTerm") {
$result = mysql_query("SELECT * FROM table WHERE MATCH (field) AGAINST ('-{$query}' IN BOOLEAN MODE)");
}
?>
hope this helps for full use of the operators in boolean matches see Boolean Full-Text Searches

You could use
With at least one of the words
SELECT * FROM myTable WHERE field LIKE '%$keyword%'
or field LIKE '%$keyword2%'
or field LIKE '%$keyword3%';
Without the word
SELECT * FROM myTable WHERE field NOT LIKE '%$keyword%';

I'm not sure you could easily do those search options in a naive manner as the other two.
It would be worth your while implementing a better search engine if you need to support those scenarios. A simple one that could probably get you by is something along these lines:
When an item is added to the database, it is split up into the individual words. At this point "common" words (the, a, etc...) are removed (probably based on a common_words table). The remaining words are added to a words table if they are not already present. There is then a link made between the word entry and the item entry.
When searching, it is then a case of getting the word ids from the word table and the appropriate lookup of item ids in the joining table.

Search is notoriously difficult to do well.
You should Consider using a third party search engine using something like Lucene or Sphider.

Giraffe and Re0sless pooseted 2 good answers.
notes:
"SELECT * " sucks... only select the columns that you need.
Re0sless puts a "OR" between keywords.
- you should eliminate common words (" ","i","am","and"..etc)
- mysql has a 8kb i belive limit on the size of the query, so for really long SELECTS you should slipt it into separate queries.
- try to eliminate duplicate keywords (if i search for "you know you like it" the SELECT should basically only search for "you" once and elimnate common words as "it")
Also try to use "LIKE" and "MATCH LIKE" (see mysql man page) it could do wonders for "fuzzy" searches

Related

Mysql FullText index searching issue

I am using MySql FullText indexing to search data from database.
Here is the query
$search_input_text = 'the_string_to_be_search';
$searchArray = explode(" ", $search_input_text);
$query="SELECT * FROM car_details
WHERE MATCH (car_trim) AGAINST ('";
foreach ($searchArray as $word) {
$query .= "+".$word."* ";
}
$query .= "' IN BOOLEAN MODE) LIMIT $start, $limit";
The query is executing fine but it has a bug, if you look at the column name you will find car_trim which is inside the MATCH() function. The column has only 3 different types of values in the database which are 'T5', 'T6' and 'T5 premier'.
When I type 'Premier' in the search bar and hit Enter, it fetches the results whose values contain the word 'Premier'. But when I type T5 or T6 , it returns an empty record. Please be sure that there are lots of records with car_trim='T5', car_trim='T6' or car_trim='T5 Premier'
I am not getting that what can be the problem with the strings T5 and T6.
MySQL has two key parameters when using full text search (and a few other important ones). The key parameters are the minimum word length and the stop words list. In short, MySQL ignores words that are less than 3 or 4 characters (depending on the storage engine) or that are in the stop word list.
Your examples ("T5" and "T6") are too short -- based on the parameter defaults.
Some other configuration parameters might be of interest, such as the maximum word length and the characters that are valid for words.
You can change the parameters for full text indexing and re-build the index.
Here is a good place to start in understanding this.

MySQL select only using first word of variable

I am using php and mySQL. I have a select query that is not working. My code is:
$bookquery = "SELECT * FROM my_books WHERE book_title = '$book' OR book_title_short = '$book' OR book_title_long = '$book' OR book_id = '$book'";
The code searches several title types and returns the desired reference most of the time, except when the name of the book starts with a numeral. Though rare, some of my book titles are in the form "2 Book". In such cases, the query only looks at the "2", assumes it is a "book_id" and returns the second entry in the database, instead of the entry for "2 Book". Something like "3 Book" returns the third entry and so forth. I am confused why the select is acting this way, but more importantly, I do not know how to fix it.
If you have a column in your table with a numeric data type (INT, maybe), then your search strategy is going to work strangely for values of $book that start with numbers. You have discovered this.
The following expression always returns true in SQL. It's not intuitive, but it's true.
99 = '99 Luftballon'
That's because, when you compare an integer to a string, MySQL implicitly does this:
CAST(stringvalue AS INT)
And, a cast of a string beginning with the text of an integer always returns the value of the integer. For example, the value of
CAST('99 Luftballon' AS INT)
is 99. So you'll get book id 99 if you look for that search term.
It's pointless to try to compare an INT column to a text string that doesn't start with an integer, because CAST('blah blah blah' AS INT) always returns zero. To make your search strategy work better, you should consider omitting OR book_id = '$book' from your search query unless you know that the entirety of $book is a number.
As others mention, my PHP allowed both numerical enties and text entries from the browser. My query was then having a hard time with this, interpreting some of my text entries as numbers by truncating the end. Thus, my "2 Book" was being interpreted as the number "2" and then being queried to find the second book in the database. To fix this I just created a simple if statement in PHP so that my queries only looked for text or numbers. Thus, in my case, my solution was:
if(is_numeric($book)){
$bookquery = "SELECT * FROM books WHERE book_id = '$book'";
}else{
$bookquery = "SELECT * FROM books WHERE book_title = '$book' OR book_title_short = '$book' OR book_title_long = '$book'";
}
This is working great and I am on my way coding happily again. Thanks #OllieJones and others for your questions and ideas which helped me see I needed to approach the problem differently.
Not sure if this is the correct answer for you but it seems like you are searching for only exact values in your select. Have you thought of trying a more generic search for your criteria? Such as...
$bookquery = "SELECT * FROM my_books WHERE book_title LIKE '".$book."' OR book_title_short LIKE '".$book."' OR book_title_long LIKE '".$book."' OR book_id LIKE '".$book."'"
If you are doing some kind of searching you might even want to ensure the characters before the search key are found as well like so....
$bookquery = "SELECT * FROM my_books WHERE book_title LIKE '%".$book."' OR book_title_short LIKE '%".$book."' OR book_title_long LIKE '%".$book."' OR book_id LIKE '%".$book."'"
The % is a special char that looks for allows you to search for the chars you want to search for PLUS any characters before this that aren't in the search criteri... for example $book = "any" with a % before hand in the query like so, '%".$book."'"`` would return bothcompanyand also the wordany` by itself.
If you need to you can add a % to the end also like so, `'%".$book."%'"`` and it would do the same for the beginning and end of the search key

MATCH ... AGAINST doesn't come up with exact results

I'm working for the first time with MATCH...AGAINST in php sql but there is one bothering me and I can't figure out how to fix it. This is my code:
SELECT * FROM m_artist WHERE match(artist_name) against('". $_POST['article_content'] ."' IN BOOLEAN MODE)
And this is $_POST['article_content']:
Wildstylez Brothers Yeah Frontliner Waveliner
Now my output should be: Wildstylez, Frontliner and Waveliner cause that's in my database. And I do but besides that I also get the Vodka Brothers, 2 Brothers of Hardstyle and more cause of the word brothers. How do I fix that SQL only selects the literal match?
Full-text search actually is a quite misleading name: you can search the full text by your query (like google does) but it won't guarantee you, that the full text equals your query.
So, according to documentation on Boolean Full-Text Searches your input Wildstylez Brothers Yeah Frontliner Waveliner is interpreted as artist_name contains (at least) one of Wildstylez, Brothers, Yeah, Frontliner and Waveliner as word. This is why you get e.g. the Vodka Brothers, which contains Brothers. For google-like purposes this is just what you want, as you want to get details on something you only know part of as in show me articles on music.
You probably want to use
artist_name LIKE '%name_part1%' OR artist_name LIKE '%name_part2%' ...
or
artist_name IN ('exact_name1', 'exact_name2', ...)
simpliest case would be doing something like
$names = explode(' ', $_POST['article_content']);
$name_searches = array_map(function($a) {return 'artist_name = \''.mysql_real_escape_string($a).'\'';}, $names);
$sql = "SELECT * FROM m_artist WHERE ".implode(" OR ", $name_searches);
but you would loose the ability to find 2 Brothers of Hardstyle as the name itself contains a space.
Another approach can be to prefix all words by '+' and stick to MATCH() AGAINST() and you will find only artists which include every word given.
Please provide more context if this is not what you are looking for.

How do I search Full text with partial matches?

I have a table, with not many rows, and neither many columns. I am doing a Full text search on 3 columns.
My code is
$search_input = trim($_GET['s']);
$search = mysql_real_escape_string($search_input);
$search = '+'.str_replace(' ', '* +', $search).'*';
$sql = "SELECT * FROM table WHERE
MATCH(def, pqr, xyz) AGAINST ('$search' IN BOOLEAN MODE)";
$result = mysql_query($sql);
I can correctly search for terms like abcdefgh, which are present as ... abcdefgh ....
But I am receiving empty set with search terms like abc, where in table entry is present something like abc-123, and also terms like abcdefghs. (notice this is plural of above)
Clearly I need to implement partial search, or something like that.
But how do I implement such a search? Any better way to do a entire table search on user input?
Do mention anything I am doing incorrectly.
EDIT : By adding * after each word, now I am able to also search for abcde, but above problems remains.
Do you mean you don't get results for 3 letter combinations? If so, you might be hitting the mysql index length (which is usually set to 3)
More info here - http://dev.mysql.com/doc/refman/5.1/en/fulltext-fine-tuning.html

PHP mysql search queries

I'm trying to create a search engine for an inventory based site. The issue is that I have information inside bbtags (like in [b]test[/b] sentence, the test should be valued at 3, whereas sentence should be valued at 1).
Here is an example of an index:
My test sentence, my my (has a SKU of TST-DFS)
The Database:
|Product| word |relevancy|
| 1 | my | 3 |
| 1 | test | 1 |
| 1 |sentence| 1 |
| 1 | TST-DFS| 10 |
But how would I match TST-DFS if the user typed in TST DFS? I would like that SKU to have a relevancy of say 8, instead of the full 10..
I have heard that the FULL TEXT search feature in MySQL would help, but I can't seem to find a good way to do it. I would like to avoid things like UNIONS, and to keep the query as optimized as possible.
Any help with coming up with a good system for this would be great.
Thanks,
Max
But how would I match TST-DFS if the user typed in TST DFS?
I would like that SKU to have a relevancy of say 8, instead of the full 10..
If I got the question right, the answer is actually easy.
Well, if you forge your query a little before sending it to mysql.
Ok, let's say we have $query and it contains TST-DFS.
Are we gonna focus on word spans?
I suppose we should, as most search engines do, so:
$ok=preg_match_all('#\w+#',$query,$m);
Now if that pattern matched... $m[0] contains the list of words in $query.
This can be fine-tuned to your SKU, but matching against full words in a AND fashion is pretty much what the user presumes is happening. (as it happens over google and yahoo)
Then we need to cook a $expr expression that will be injected into our final query.
if(!$ok) { // the search string is non-alphanumeric
$expr="false";
} else { // the search contains words that are no in $m[0]
$expr='';
foreach($m[0] as $word) {
if($expr)
$expr.=" AND "; // put an AND inbetween "LIKE" subexpressions
$s_word=addslashes($word); // I put a s_ to remind me the variable
// is safe to include in a SQL statement, that's me
$expr.="word LIKE '%$s_word%'";
}
}
Now $expr should look like "words LIKE '%TST%' AND words LIKE '%DFS%'"
With that value, we can build the final query:
$s_expr="($expr)";
$s_query=addslashes($query);
$s_fullquery=
"SELECT (Product,word,if((word LIKE '$s_query'),relevancy,relevancy-2) as relevancy) ".
"FROM some_index ".
"WHERE word LIKE '$s_query' OR $s_expr";
Which shall read, for "TST-DFS":
SELECT (Product,word,if((word LIKE 'TST-DFS'),relevancy,relevancy-2) as relevancy)
FROM some_index
WHERE word LIKE 'TST-DFS' OR (word LIKE '%TST%' AND word LIKE '%DFS%')
As you can see, in the first SELECT line, if the match is partial, mysql will return relevancy-2
In the third one, the WHERE clause, if the full match fails, $s_expr, the partial match query we cooked in advance, is tried instead.
I like to lower case everything and strip out special characters (like in a phone number or credit card I take everything out on both sides that isn't a number)
Rather than try to create your own FTS solution, you could try to fit the MySQL FTS engine to your requirements. What I've seen done is create a new table to store your FTS data. Create a column for each different piece of data that you want to have a different relevance. For your sku field you could store the raw sku, with spaces, underscores, hyphens and any other special character intact. Then store a stripped down version with all these things removed. You may also want to store a version with leading zeros removed, as people often leave things like that out. You can store all these variations in the same column. Store your product name in another column, and the product description in another column. Create a separate index on each column. Then when you do your search, you can search each column individually, and multiply the rank of the results based on how important you think that column is. So you could multiply sku results by 10, title by 5 and leave description results as is. You may have to do a little experimentation to get the results you want, but it may ultimately be simpler than creating your own index.
Create a keywords table. Something along the lines of:
integer keywordId (autoincrement) | varchar keyword | int pointValue
Assign all possible keywords, skus, etc, into this table. Create another table, a post-keywords bridge, (assuming postId is the id you've assigned in your original table) along the lines of:
integer keywordId | integer postId
Once you have this, you can easily add keywords to each post as it is interested. To calculate total point value for a given post, a query such as the following should do the trick:
SELECT sum(pointValue) FROM keywordPostsBridge kpb
JOIN keywords k ON k.keywordId = kpb.keywordId
WHERE kpb.postId = YOUR_INTENDED_POST
I think the solution is quite straightforward unless I missed something.
Basically run two search, one is exact match, the other is like match or regex match.
Join two resultsets together, like match left join exact match. Then for example:
final_relevancy = (IFNULL(like_relevancy, 0) + IFNULL(exact_relevancy, 0) * 3) / 4
I didn't try this myself though. Just an idea.
I would add a column that is stripped of all special character's, misspellings, and then upcased (or create a function that compares on text that has been stripped and upcased). That way your relevancy will be consistent.
/*
q and q1 - you table
this query takes too much resources,
make from it update-query ( scheduled task or call it on_save if you develop new system )
*/
SELECT
CASE
WHEN word NOT REGEXP "^[a-zA-Z]+$"
/*many replace with junk characters
or create custom function
or if you have full db access install his https://launchpad.net/mysql-udf-regexp
*/
THEN REPLACE(REPLACE( word, '-', ' ' ), '#', ' ')
ELSE word
END word ,
CASE
WHEN word NOT REGEXP "^[a-zA-Z]+$"
THEN 8
ELSE relevancy
END relevancy
FROM ( SELECT 'my' word,
3 relevancy
UNION
SELECT 'test' word,
1 relevancy
UNION
SELECT 'sentence' word,
1 relevancy
UNION
SELECT 'TST-DFS' word,
10 relevancy
)
q
UNION
SELECT *
FROM ( SELECT 'my' word,
3 relevancy
UNION
SELECT 'test' word,
1 relevancy
UNION
SELECT 'sentence' word,
1 relevancy
UNION
SELECT 'TST-DFS' word,
10 relevancy
)
q1
it is a page coading where query result shows
**i can not use functions by use them work are more easier**
<html>
<head>
</head>
<body>
<?php
//author S_A_KHAN
//date 10/02/2013
$dbcoonect=mysql_connect("127.0.0.1","root");
if (!$dbcoonect)
{
die ('unable to connect'.mysqli_error());
}
else
{
echo "connection successfully <br>";
}
$data_base=mysql_select_db("connect",$dbcoonect);
if ($data_base==FALSE){
die ('unable to connect'.mysqli_error($dbcoonect));
}
else
{
echo "connection successfully done<br>";
***$SQLString = "select * from user where id= " . $_GET["search"] . "";
$QueryResult=mysql_query($SQLString,$dbcoonect);***
echo "<table width='100%' border='1'>\n";
echo "<tr><th bgcolor=gray>Id</th><th bgcolor=gray>Name</th></tr>\n";
while (($Row = mysql_fetch_row($QueryResult)) !== FALSE) {
echo "<tr><td bgcolor=tan>{$Row[0]}</td>";
echo "<td bgcolor=tan>{$Row[1]}</td></tr>";
}
}
?>
</body>
</html>

Categories