Database search like google [duplicate] - php

This question already has answers here:
Google-like Search Engine in PHP/mySQL [closed]
(9 answers)
Closed 1 year ago.
I currently have a search option on my PHP+MYSQL website.
The MYSQL query is currently something like "SELECT pageurl WHERE name LIKE '%$query%'.
The reason I posted here is because I noticed that if the name of one of my products is "Blue Bike" and someone looks for "Bike Blue", no results are returned.
I am looking for a solution to this because I know that if I type on google same word, something appears.
I was thinking to create a PHP function to mix up all the words from the query if the query is having 4 or fewer words, generating around 24 queries.
Is there an easier solution to this?
Thanks for your time

As to not let this go without a working answer:
<?php
$search = 'this is my search';
$searchSplit = explode(' ', $search);
$searchQueryItems = array();
foreach ($searchSplit as $searchTerm) {
/*
* NOTE: Check out the DB connections escaping part
* below for the one you should use.
*/
$searchQueryItems[] = "name LIKE '%" . mysqli_real_escape_string($searchTerm) . "%'";
}
$query = 'SELECT pageurl FROM names' . (!empty($searchQueryItems) ? ' WHERE ' . implode(' AND ', $searchQueryItems) : '');
?>
DB connections escaping
mysqli_:
Keep using mysqli_real_escape_string or use $mysqli->real_escape_string($searchTerm).
mysql_:
if you use mysql_ you should use mysql_real_escape_string($searchTerm) (and think about changing as it's deprecated).
PDO:
If you use PDO, you should use trim($pdo->quote($searchTerm), "'").

use full text search instead of like
full text search based on indexed text and is very faster and beter than using like.
see this article for more information about full text search

What you are looking for is fulltext search.
Try Sphinx, it is very fast and integrates well with MySQL.
Sphinx website

I wrote a function that approaches Google's operation taking into account the double quotes for the elements to search as a whole block. It does NOT take into account the - or * instructions.
table: MySQL table to consider
cols: array of column to parse
searchParams: search to process. For example: red mustang "Florida 90210"
function naturalQueryConstructor($table, $cols, $searchParams) {
// Basic processing and controls
$searchParams = strip_tags($searchParams);
if( (!$table) or (!is_array($cols)) or (!$searchParams) ) {
return NULL;
}
// Start query
$query = "SELECT * FROM $table WHERE ";
// Explode search criteria taking into account the double quotes
$searchParams = str_getcsv($searchParams, ' ');
// Query writing
foreach($searchParams as $param) {
if(strpos($param, ' ') or (strlen($param)<4)) {
// Elements with space were between double quotes and must be processed with LIKE.
// Also for the elements with less than 4 characters. (red and "Florida 90210")
$query .= "(";
// Add each column
foreach($cols as $col) {
if($col) {
$query .= $col." LIKE '%".$param."%' OR ";
}
}
// Remove last ' OR ' sequence
$query = substr($query, 0, strlen($query)-4);
// Following criteria will added with an AND
$query .= ") AND ";
} else {
// Other criteria processed with MATCH AGAINST (mustang)
$query .= "(MATCH (";
foreach($cols as $col) {
if($col) {
$query .= $col.",";
}
}
// Remove the last ,
$query = substr($query, 0, strlen($query)-1);
// Following criteria will added with an AND
$query .= ") AGAINST ('".$param."' IN NATURAL LANGUAGE MODE)) AND ";
}
}
// Remove last ' AND ' sequence
$query = substr($query, 0, strlen($query)-5);
return $query;
}
Thanks to the stackoverflow community where I found parts of this function!

To have a google like search you'd need many database and index nodes, crazy algorithms.. now you come up with a SELECT LIKE ... lol :D
MySQL is slow in searching, you'd need fulltext and index set properly (MyISAM or Aria Engine). Combinations or different entities to search for are almost not implementable properly AND fast.
I'd suggest to setup an Elasticsearch server which is based on Apache's Lucene.
This searchs very fast and is easy to maintain. And you would not have to care about SQL injection and can still use the mysql server fast.
Elasticsearch (or other Lucene based search engines like SolR) can easily be installed on any server because they are written in Java.
Good documentation:
http://www.elasticsearch.org/guide/en/elasticsearch/client/php-api/current/

I would do an explode first:
$queryArray = explode(" ", $query);
and then generate the SQL query something like:
for ($i=0; $i< count($queryArray); $i++) {
$filter += " LIKE '%" + $queryArray[$i] + "%' AND" ;
}
$filter = rtrim ($filter, " AND");
$sql = "SELECT pageurl FROM ... WHERE name " + $filter
(note: haven't tested/run this code)

Related

php and mysql live search

I'm currently working on a live search that displays results directly from a mysql db.
The code works, but not really as i want it.
Let's start with an example so that it is easier to understand:
My database has 5 columns:
id, link, description, try, keywords
The script that runs the ajax request on key up is the following:
$("#searchid").keyup(function () {
var searchid = encodeURIComponent($.trim($(this).val()));
var dataString = 'search=' + searchid;
if (searchid != '') {
$.ajax({
type: "POST",
url: "results.php",
data: dataString,
cache: false,
success: function (html) {
$("#result").html(html).show();
}
});
}
return false;
});
});
on the results.php file looks like this:
if ($db->connect_errno > 0) {
die('Unable to connect to database [' . $db->connect_error . ']');
}
if ($_REQUEST) {
$q = $_REQUEST['search'];
$sql_res = "select link, description, resources, keyword from _db where description like '%$q%' or keyword like '%$q%'";
$result = mysqli_query($db, $sql_res) or die(mysqli_error($db));
if (mysqli_num_rows($result) == 0) {
$display = '<div id="explainMessage" class="explainMessage">Sorry, no results found</div>';
echo $display;
} else {
while ($row = $result->fetch_assoc()) {
$link = $row['link'];
$description = $row['description'];
$keyword = $row['keyword'];
$b_description = '<strong>' . $q . '</strong>';
$b_keyword = '<strong>' . $q . '</strong>';
$final_description = str_ireplace($q, $b_description, $description);
$final_keyword = str_ireplace($q, $b_keyword, $keyword);
$display = '<div class="results" id="dbResults">
<div>
<div class="center"><span class="">Description :</span><span class="displayResult">' . $final_description . '</span></div>
<div class="right"><span class="">Keyword :</span><span class="displayResult">' . $final_keyword . '</span></div>
</div>
<hr>
</div>
</div>';
echo $display;
}
}
}
now, let's say that i have this row in my DB:
id = 1
link = google.com
description = it's google
totry = 0
keywords: google, test, search
if i type in the search bar:
google, test
i have the right result, but if i type:
test, google
i have no results, as obviously the order is wrong.
So basically, what o'd like to achieve is something a bit more like "tags", so that i can search for the right keywords without having to use the right order.
Can i do it with my current code (if yes, how?) or i need to change something?
thanks in advance for any suggestion.
PS: I know this is not the best way to read from a DB as it has some security issues, i'm going to change it later as this is an old script that i wrote ages ago, i'm more interested in have this to work properly, and i'm going to change method after.
Normalize your schema
The rules of relational database are very simple (at least the first three).
keywords: google, test, search
...breaks the second rule. Each keyword should be in its own row in a related table. Then you can simply write your query as....
SELECT link, description, resources, keyword
FROM _db
INNER JOIN keywords
ON _db.id=keywords.db_id
WHERE keyword.value IN (" . atomize($q) . ")
(where atomize explodes the query string, applies mysqli_escape_paramter() to each entry to avoid breaking your code, encloses each term in single quotes and concatenates the result).
Alternatively you could use MySQL's full text indexing which does this for you transparently.
Although hurricane makes some good points in his/her answer, they do not mention that none of the solutions proposed there does not scale to handle large volumes of data with any efficiency (decomposing the field into a new table/using full text indexing does).
Untested code but modify according to your needs,
$q = $_REQUEST['search'];
$q_comma = explode(",", $q);
$where_in_set = '';
$count = count($q_comma);
foreach( $q_comma as $q)
{
$counter++;
if($counter == $count) {
$where_in_set .= "FIND_IN_SET('$q','keywords')";
}else {
$where_in_set .= "FIND_IN_SET('$q','keywords') OR ";
}
}
$sql_res = "select link, description, resources, keyword from _db where $where_in_set or description like '%$q%'";
There are 2 solutions I can think of:
Use fulltext index and search.
You can split the search string into words in php for example using explode() and serach for the words not in a single serach criteria, but in separate ones. This latter one can be very resource intensive, since you are seraching in multiple fields.
LIKE '%google, test%' will match id=1 but not '%google,test%' (no space between coma) nor '%google test%' (space delimiter) nor '%test, google%'. Put each keyword as separate table or you can split input keywords into several single keyword and use OR operator such as LIKE 'google%' OR LIKE 'test%'
Not an ideal solution, but instead of treating your search datastring as one element, you can have php treat it as an array of keywords separated by a comma (by using explode). You'd then build a query depending on how many keywords were sent.
For example, using "google, test" your query would be:
$sql_res = "select link, description, resources, keyword from _db where (description like '%$q1%' or keyword like '%$q1%') AND (description like '%$q2%' or keyword like '%$q2%')";
Where $q1 and $q2 are "google" and "test".
First of all as you say it is not a good way to do it. I think you are writing a autocompleter.
Seperators for words
"google, test" or "test, google" is a attached words. First you need to define a seperator for users. Usually it is a whitespace ' '.
When you define it you need to split words.
$words = explode(" ",$q);
// now you get two words "google," and "test"
Then you need to create a sql which gives you multiple search chance.
There are a lot example in MySQL LIKE IN()?
Now you get your result.
Text similarity
Select all result from db and in a while search a text from another text. It gives you a dobule point for similarity. Best result is your result.
Php Similarity Example
Important Info
If you ask my opinion don't use it like that bcs it is very expensive. Use autocompleters on html side. Here is an example

MySQL query that can handle missing characters

I'm trying to improve my MySQL query.
SELECT gamename
FROM giveaway
WHERE gamename LIKE '$query'
I got an input that consists of URL's that are formed like:
http://www.steamgifts.com/giveaway/l7Jlj/plain-sight
http://www.steamgifts.com/giveaway/okjzc/tex-murphy-martian-memorandum
http://www.steamgifts.com/giveaway/RqIqD/flyn
http://www.steamgifts.com/giveaway/FzJBC/penguins-arena-sednas-world
I take the game name from the URL and use this as input for a SQL query.
$query = "plain sight"
$query = "tex murphy martian memorandum"
$query = "flyn"
$query = "penguins arena sednas world"
Now in the database the matching name sometimes has more characters like : ' !, etc.
Example:
"Plain Sight"
"Tex Murphy: Martian Memorandum"
"Fly'N"
"Penguins Arena: Sedna's World!"
So when putting in the acquired name from the URL this doesn't produce results for the 2nd, 3rd and 4th example.
So what I did was use a % character.
$query = "plain%sight"
$query = "tex%murphy%martian%memorandum"
$query = "flyn"
$query = "penguins%arena%sednas%world"
This now gives result on the 1st and 2nd example.
.
On to my question:
My question is, how to better improve this so that also the 3rd and 4th ones work?
I'm thinking about adding extra % before and after each character:
$query = "%f%l%y%n%"
$query = "%p%e%n%g%u%i%n%s%a%r%e%n%a%s%e%d%n%a%s%w%o%r%l%d%"
But I'm not sure how that would go performance wise and if this is the best solution for it.
Is adding % a good solution?
Any other tips on how to make a good working query?
Progress:
After a bit of testing I found that adding lots of wildcards (%) is not a good idea. You will get returned unexpected results from the database, simply because you just added a lot of ways things could match.
Using the slug method seems to be the only option.
If i get your question well, you are creating a way of searching through those informations. And if that is the case then try
$query = addslashes($query);
SELECT name
FROM giveaway
WHERE gamename LIKE '%$query%'
Now if you want to enlarge your search and search for every single word that looks like the words in your string, then you can explode the text and search for each word by doing
<?php
$query = addslashes($query);
//We explode the query into a table
$tableau=explode(' ',$query);
$compter_tableau=count($tableau);
//We prepare the query
$req_search = "SELECT name FROM giveaway WHERE ";
//we add the percentage sign and the combine each query
for ($i = 0; $i < $compter_tableau; $i++)
{
$notremotchercher=$tableau["$i"];
if($i==$compter_tableau) { $liaison="AND"; } else { $liaison=""; }
if($i!=0) { $debutliaison="AND"; } else { $debutliaison=""; }
$req_search .= "$debutliaison gamename LIKE '%$notremotchercher%' $liaison ";
}
//Now you lauch your query here
$selection=mysqli_query($link, "$req_search") or die(mysqli_error($link));
?>
By so doing you would have added the % to every word in your query which will give you more result that you can choose from.

Multiple OR terms in WHERE clause

Ii am building a poor-man's search engine for in-house use. Back when I was doing this sort of thing using MySQLi or a pgsql db in code, I would write something along the lines of this:
$sql = "SELECT * FROM $table ";
$WC = array();
foreach($columns as $col )
{
$WC[] = $col ." like '%".$term."%'";
}
$sql .= ' WHERE ('. implode(') OR (', $WC). ')';
return $sql;
which would concatenate a number (unknown) of column names with the search term. The resulting SQL looking somewhat like this:
SELECT * FROM books WHERE (title like '%kids%') OR (publisher like '%kids%') OR (author like '%kids%');
Fairly crude but effective to a point. Having a small and well known data set allows this sort of thing to be quite useful.
However...
I cannot for the life of me figure out how to append an arbitrary number of OR clauses to a DB::table('books')-> ... construct. Documentation completely skirts this sort of construction. Can anyone point me in the right direction?
Just call orWhere repeatedly on the query object:
$query = DB::table('books');
foreach ($columns as $column)
{
$query->orWhere($column, 'like', "%{$term}%");
}
return $query;
You should better use MATCH AGAINST for that search.

PHP Search Using Multiple Words

I am trying to create a search function where a user can input two words into a text field and it will split the words and construct a MySQL query.
This is what I have so far.
$search = mysql_real_escape_string( $_POST['text_field']);
$search = explode(" ", $search);
foreach($search as $word)
{
$where = "";
$where .= "product_code LIKE '%". $word ."%'";
$where .= "OR description LIKE '%". $word ."%'";
$query = "SELECT * FROM customers WHERE $where";
$result = mysql_query($query) or die();
if(mysql_num_rows($result))
{
while($row = mysql_fetch_assoc($result))
{
$customer['value'] = $row['id'];
$customer['label'] = "{$row['id']}, {$row['name']} {$row['age']}";
$matches[] = $customer;
}
}
else
{
$customer['value'] = "";
$customer['label'] = "No matches found.";
$matches[] = $customer;
}
}
$matches = array_slice($matches, 0, 5); //return only 5 results
It constructs and runs the query, but returns funny results.
Any help would be appreciated.
MySQL has something called LIMIT, so you last row would be needless.
Use Full-Text-Search for this: http://dev.mysql.com/doc/refman/5.0/en/fulltext-search.html - It's faster and more elegant
If your database is on MyISAM table format you could do a Fulltext search on the columns you are interested as Sn0opy mentioned already
Personally I believe that when it comes to mySQL if you actually want to create a great search engine use Sphinx (http://sphinxsearch.com/) or Solr (http://lucene.apache.org/solr/)
There may be a learning curve on both of them, but the results are professional.
Any chance of anything more specific than "funny results"? Off the cuff there are several possibilities but it really depends upon the results that are being returned. My PHP is a bit rusty so I will apologize up front if my brain throws in some java rules instead, but at first blush...
Name the array something other than $search. It probably isn't the problem, but it looks odd to have the array created by explode() carry the name of the string being exploded. Try something like $searched = explode(" ", $search); and then use $searched in the subsequent foreach() loop.
What if the user only puts in one search term? If there is no space in $text_field then explode will return an empty array, which should thoroughly jack up your query. You should at least verify that there is a space in $text_field before exploding $search. Likewise, what if the user enters two search terms, but one of the terms is two words separated by a space? Again you are going to get "funny results" because you will get results that you don't want along with duplicated results as the query extends itself to both of the words in a term individually.
Without knowing more of what you mean by "funny results" it is really difficult to trouble shoot this one.

Specialized Search Query Refinement for Auto-Complete function

I am doing a query for an autocomplete function on a mysql table that has many instances of similar titles, generally things like different years, such as '2010 Chevrolet Lumina' or 'Chevrolet Lumina 2009', etc.
The query I am currently using is:
$result = mysql_query("SELECT * FROM products WHERE MATCH (name) AGAINST ('$mystring') LIMIT 10", $db);
The $mystring variable gets built as folows:
$queryString = addslashes($_REQUEST['queryString']);
if(strlen($queryString) > 0) {
$array = explode(' ', $queryString);
foreach($array as $var){
$ctr++;
if($ctr == '1'){
$mystring = '"' . $var . '"';
}
else {
$mystring .= ' "' . $var . '"';
}
}
}
What I need to be able to do is somehow group things so only one version of a very similar actually shows in the autosuggest dropdown, leaving room for other products with chevrolet in them as well. Currently it is showing all 10 spots filled with the same product with different years, options, etc.
This one should give some of you brainiacs a good workout :)
I think the best way to do this would be to create a new field on the products table, something like classification. All the models would be entered with the same classification (e.g. "Chevrolet"). You could then still MATCH AGAINST name, but GROUP BY classification. Assuming you are using MySQL you can cheat a little and get away with selecting values and matching against values that you are not grouping by. Technically in SQL this gives undefined results and many SQL engines will not even let you try to do this, but MySQL lets you do it -- and it returns a more-or-less random sample that matches. So, for example, if you did the above query, grouped by classification, only one model (picked pretty much at random) will show up in the auto-completer.

Categories