Quick MySQL/PHP question. I'm using a "not-so-strict" search query as a fallback if no results are found with a normal search query, to the tune of:
foreach($find_array as $word) {
$clauses[] = "(firstname SOUNDS LIKE '$word%' OR lastname SOUNDS LIKE '$word%')";
}
if (!empty($clauses)) $filter='('.implode(' AND ', $clauses).')';
$query = "SELECT * FROM table WHERE $filter";
Now, I'm using PHP to highlight the results, like:
foreach ($find_array as $term_to_highlight){
foreach ($result as $key => $result_string){
$result[$key]=highlight_stuff($result_string, $term_to_highlight);
}
}
But this method falls on its ass when I don't know what to highlight. Is there any way to find out what the "sound-alike" match is when running that mysql query?
That is to say, if someone searches for "Joan" I want it to highlight "John" instead.
Note that SOUNDS LIKE does not work as you think it does. It is not equivalent to LIKE in MySQL, as it does not support the % wildcard.
This means your query will not find "John David" when searching for "John". This might be acceptable if this is just your fallback, but it is not ideal.
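To see why the wildcard has no effect, it helps to look at the keys that are being compared. This is a minimal sketch using PHP's soundex(), which returns the classic four-character key; MySQL's SOUNDEX() can return longer strings for long inputs, so treat the PHP output as an approximation of what the server compares.

```php
<?php
// SOUNDS LIKE compares whole soundex keys, so the key of the full
// column value has to equal the key of the search term.
$term       = soundex('Joan');        // key of the search term
$singleName = soundex('John');        // key of a one-word column value
$fullName   = soundex('John David');  // key of a two-word column value

var_dump($term === $singleName); // equal keys, so "John" matches "Joan"
var_dump($term === $fullName);   // the extra consonants change the key, so no match
```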
So here is a different suggestion (that might need improvement): first use PHP's soundex() function to find the soundex of the keyword you are looking for.
$soundex = soundex($word);
$soundexPrefix = substr($soundex, 0, 2); // first two characters of soundex
$sql = "SELECT lastname, firstname ".
"FROM table WHERE SOUNDEX(lastname) LIKE '$soundexPrefix%' ".
"OR SOUNDEX(firstname) LIKE '$soundexPrefix%'";
Now you'll have a list of first names and last names that sound vaguely similar (this might be a lot of entries, and you might want to increase the length of the soundex prefix you use for your search). You can then calculate the Levenshtein distance between the soundex of each word and your search term, and sort by that.
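The ranking step could be sketched as below. The function name rank_by_distance and the sample names are made up for illustration, and the sketch compares the names themselves (lower-cased) rather than their soundex keys, which is an assumption about what gives the most useful ordering.

```php
<?php
// Sort candidate names by edit distance to the search term, closest first.
// $candidates stands in for the names returned by the soundex query.
function rank_by_distance(array $candidates, string $term): array
{
    $term = strtolower($term);
    usort($candidates, function ($a, $b) use ($term) {
        return levenshtein(strtolower($a), $term) <=> levenshtein(strtolower($b), $term);
    });
    return $candidates;
}

$ranked = rank_by_distance(array('Jane', 'Joanna', 'John'), 'Joan');
print_r($ranked);
```

The first element of $ranked is then the best candidate to highlight.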
Second, you should look at parameterized queries in MySQL, to avoid SQL injection bugs.
The SOUNDS LIKE condition just compares the SOUNDEX key of both words, and you can use the PHP soundex() function to generate the same key.
So, if you found a matching row and needed to find out which word to highlight, you can fetch both the firstname and lastname, and then use PHP to find which one matches and highlight just that word.
I made this code just to try this out. (Had to test my theory xD)
<?php
// A space separated string of keywords, presumably from a search box somewhere.
$search_string = 'John Doe';
// Create a data array to contain the keywords and their matches.
// Keywords are grouped by their soundex keys.
$data = array();
foreach(explode(' ', $search_string) as $_word) {
$data[soundex($_word)]['keywords'][] = $_word;
}
// Execute a query to find all rows matching the soundex keys for the words.
$soundex_list = "'". implode("','", array_keys($data)) ."'";
$sql = "SELECT id, firstname, lastname
FROM sounds_like
WHERE SOUNDEX(firstname) IN({$soundex_list})
OR SOUNDEX(lastname) IN({$soundex_list})";
$sql_result = $dbLink->query($sql);
// Add the matches to their respective soundex key in the data array.
// This checks which word matched, the first or last name, and tags
// that word as the match so it can be highlighted later.
if($sql_result) {
while($_row = $sql_result->fetch_assoc()) {
foreach($data as $_soundex => &$_elem) {
if(soundex($_row['firstname']) == $_soundex) {
$_row['matches'] = 'firstname';
$_elem['matches'][] = $_row;
}
else if(soundex($_row['lastname']) == $_soundex) {
$_row['matches'] = 'lastname';
$_elem['matches'][] = $_row;
}
}
}
}
// Print the results as a simple text list.
header('content-type: text/plain');
echo "-- Possible results --\n";
foreach($data as $_group) {
// Print the keywords for this group's soundex key.
$keyword_list = "'". implode("', '", $_group['keywords']) ."'";
echo "For keywords: {$keyword_list}\n";
// Print all the matches for this group, if any.
if(isset($_group['matches']) && count($_group['matches']) > 0) {
foreach($_group['matches'] as $_match) {
// Highlight the matching word by encapsulating it in dashes.
if($_match['matches'] == 'firstname') {
$_match['firstname'] = "-{$_match['firstname']}-";
}
else {
$_match['lastname'] = "-{$_match['lastname']}-";
}
echo " #{$_match['id']}: {$_match['firstname']} {$_match['lastname']}\n";
}
}
else {
echo " No matches.\n";
}
}
?>
A more generalized function, to pull out the matching soundex word from a string, could look like:
<?php
/**
* Attempts to find the first word in the $heystack that is a soundex
* match for the $needle.
*/
function find_soundex_match($heystack, $needle) {
$words = explode(' ', $heystack);
$needle_soundex = soundex($needle);
foreach($words as $_word) {
if(soundex($_word) == $needle_soundex) {
return $_word;
}
}
return false;
}
?>
Which, if I am understanding it correctly, could be used in your previously posted code as:
foreach ($find_array as $term_to_highlight){
foreach ($result as $key => $result_string){
$match_to_highlight = find_soundex_match($result_string, $term_to_highlight);
$result[$key]=highlight_stuff($result_string, $match_to_highlight);
}
}
This wouldn't be as efficient, though, as the more targeted code in the first snippet.
I have a MySQL table with a keywords column that holds CSV keywords like "hotel, new hotel, good hotel".
When the user enters hotel it works (rows are selected), but not for hotels (which is expected). I want a search for hotels to also match the hotel keyword.
In short, the search should work with suffixes. Currently I have implemented the following.
$queried = trim(mysqli_real_escape_string($con,$_POST['query']));
$keys = explode(" ",$queried);
$sql = "SELECT name FROM image WHERE keyword LIKE '%$queried%'"; // double quotes so that $queried is interpolated
foreach($keys as $k){
$k= trim(mysqli_real_escape_string($con,$k));
if(count($keys) > 1)
{
$sql .= " OR keyword LIKE '%$k%' "; // double quotes so that $k is interpolated
}
}
You'd have to (additionally) ask whether the search term contains any of the keywords in the rows. Currently you're doing the opposite (which is fine for the opposite situation, so don't get rid of it).
Something like:
$sql = "SELECT name FROM image WHERE '$queried' LIKE CONCAT('%', keyword, '%')";
(Apologies if MySQL syntax isn't quite right, not used it for a while).
It might occasionally throw up unwanted things though, e.g. if the user wrote "aparthotel" it'd still return "hotel", which you may or may not want. Or it could even return something entirely irrelevant, depending on the words involved.
Once you get onto anything more complex than that though, you're probably into the realms of search engines and natural language processing.
I did it this way. It's not exactly what I want, but it works for my criteria.
$suffix = array('','s','es','ing','ment'); // suffixes you want to strip
$sql = "SELECT name FROM image WHERE keyword LIKE '%$queried%'";
foreach($keys as $k)
{
$k= trim(mysqli_real_escape_string($con,$k));
$wp = ""; // reset the stem for every keyword
for ($i=1; $i < sizeof($suffix) ; $i++)
{
if(substr($k, (-1 * strlen($suffix[$i])))==$suffix[$i])
{
$wp=substr( $k, 0, (-1 * strlen($suffix[$i]))); // cut off the whole suffix
}
}
if($wp!="")
{
$sql .= " OR keyword LIKE '%$k%' OR keyword LIKE '%$wp%' ";
}
}
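The suffix handling can also be pulled out into a small helper, which makes it easier to test on its own. This is only a sketch: strip_suffix is a made-up name, the suffix list is the one from the post, and the three-character minimum for the stem is an assumption to keep short words intact. It is a naive rule and not a real stemmer.

```php
<?php
// Return the word without a known suffix, or unchanged if none applies.
// Longer suffixes are tried first so that "es" wins over "s".
function strip_suffix(string $word, array $suffixes = array('ment', 'ing', 'es', 's')): string
{
    usort($suffixes, function ($a, $b) {
        return strlen($b) - strlen($a);
    });
    foreach ($suffixes as $suffix) {
        $stemLength = strlen($word) - strlen($suffix);
        // Keep at least three characters so short words are left alone.
        if ($stemLength >= 3 && substr($word, -strlen($suffix)) === $suffix) {
            return substr($word, 0, $stemLength);
        }
    }
    return $word;
}

echo strip_suffix('hotels'), "\n";
echo strip_suffix('boxes'), "\n";
```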
My code lets me perform a search, as long as the order of the words is correct.
Let's say I'm searching for big dog, but I also want to match dog big. It gets more complicated with 3 or more words.
Is there a way to create a SQL query that would let me search for the words in any order?
The only way I can think of is to run multiple queries, changing the order of the PHP variables manually...
<?php
if(isset($_GET['query']) && !empty($_GET['query'])) {
$query = $_GET['query'];
$query_array = explode(' ', $query);
$query_string = '';
$query_counter = 1;
foreach($query_array as $word) {
$query_string .= '%' . $word . (count($query_array) == $query_counter++ ? '%' : ''); // count the words, not the string
}
$query = "SELECT * FROM pages WHERE Name LIKE '$query_string'";
$result = sqlsrv_query($cms->conn, $query);
while($row = sqlsrv_fetch_array($result)) {
extract($row);
echo ''.$Name.'<br>';
}
sqlsrv_free_stmt($result);
}
else {
//echo 'NO GET';
}
?>
You could assemble your conditions and check for each word on its own:
$query_array = explode(' ', $query);
$queryParts = array();
foreach ($query_array AS $value){
$queryParts[]="Name like '%".mysql_real_escape_string($value)."%'";
}
$searchString = implode(" AND ", $queryParts);
The search string would now be Name like '%big%' AND Name like '%dog%' ..., depending on how many search keywords there are.
I use the same approach very often, also when it is required that ALL keywords appear in at least ONE of the columns. Then you need one more loop to create the required AND conditions:
$search = "Big Dog";
$keywords = explode (" ", $search);
$columns = array("Name", "description");
$andParts = array();
foreach ($keywords AS $keyword){
$orParts = array();
foreach($columns AS $column){
$orParts[] = $column . " LIKE '%" . mysql_real_escape_string($keyword) . "%'";
}
$andParts[]= "(" . implode(" OR ", $orParts) . ")"; // separator first; the reversed order was removed in PHP 8
}
$and = implode (" AND ", $andParts);
echo $and;
This would produce the query part (Name LIKE '%Big%' OR description LIKE '%Big%') AND (Name LIKE '%Dog%' OR description LIKE '%Dog%')
So it will find any row where dog and big each appear in at least one of the columns Name or description (both may be in the same column).
Your original query string is something like %big%dog%, so I assume you are okay with matching big wild dog. In this case, you can just use the AND operator.
(Name LIKE '%big%' AND Name LIKE '%dog%')
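The same AND-of-LIKE idea can be built with placeholders instead of escaped strings. A minimal sketch, assuming the pages table and Name column from the question; build_all_words_filter is a hypothetical helper name, and the column name must be hard-coded rather than taken from user input.

```php
<?php
// Build "col LIKE ? AND col LIKE ?" plus the values to bind.
// $column must be a hard-coded name, never user input.
function build_all_words_filter(string $column, string $search): array
{
    $words = preg_split('/\s+/', trim($search), -1, PREG_SPLIT_NO_EMPTY);
    if (!$words) {
        return array('1 = 1', array()); // nothing to filter on
    }
    $clauses = array();
    $params = array();
    foreach ($words as $word) {
        $clauses[] = "$column LIKE ?";
        $params[] = '%' . $word . '%';
    }
    return array(implode(' AND ', $clauses), $params);
}

list($where, $params) = build_all_words_filter('Name', 'big dog');
$sql = "SELECT * FROM pages WHERE $where";
echo $sql, "\n";
// With the sqlsrv connection from the question:
// $result = sqlsrv_query($cms->conn, $sql, $params);
```

Because every word gets its own condition, the order in which the words were typed no longer matters.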
MyISAM supports full-text search:
http://dev.mysql.com/doc/refman/5.0/en/fulltext-search.html
One thing you could look into is Full Text Search for ms sql server.
https://msdn.microsoft.com/en-us/library/ms142571.aspx
It's similar to a "search engine" in that it uses an algorithm to rank results and can even match similar words (think thesaurus-type lookups).
It's not exactly trivial to set up, but it's easy enough to find a tutorial on the subject and on how to query FTS (the syntax is different from, say, LIKE '%big%dog%').
Here's a sample query from the page linked above:
SELECT product_id
FROM products
WHERE CONTAINS(product_description, '"Snap Happy 100EZ" OR FORMSOF(THESAURUS,"Snap Happy") OR "100EZ"')
AND product_cost < 200 ;
I have been working on this for a few days now and I cannot seem to meet these requirements while properly using prepared statements.
Here are my requirements:
1. User must be able to select whole or partial word matches.
2. User must be able to enter single or multiple keywords into the search form.
3. Keywords must be highlighted in results and reflect whole or partial keywords entered.
4. Must make proper use of prepared statements.
I have this all working in Firefox (I have not performed cross browser checks yet) however I don't know how to meet requirement #4.
Here is my code...
<?php
$checkbox = "false"; // Checkbox selected in form. Use literal string for value of checkbox.
$keywords = "whole, part";// Search terms entered into form, separated by commas. Use combination of known whole words and partial matches for testing.
$keywords = str_replace(" ","",$keywords);// Close all empty spaces to meet syntax requirements.
$keywords = explode(",",$keywords); // If there are multiple keywords, convert them into an array, otherwise is_array appears to return single words as 'false'.
if($checkbox == "true") // If "whole words only" is checked
{
if(is_array($keywords))
{
$search = "'[[:<:]]".implode("[[:>:]]|[[:<:]]",$keywords)."[[:>:]]'"; // Whole word REGEXP search for multiple keywords, separated by vertical bar to meet syntax requirements.
}
else
{
$search = "'[[:<:]]".$keywords."[[:>:]]'"; // Whole word REGEXP search for single keywords.
}
}
else
{
if(is_array($keywords))
{
$search = "'".implode("|",$keywords)."'"; // Any word REGEXP search for multiple keywords, separated by vertical bar to meet syntax requirements.
}
else
{
$search = "'".$keywords."'"; // Any word REGEXP search for single keywords.
}
}
echo $search; // Error checking
$sql = ("SELECT column FROM table WHERE column REGEXP $search");
if(!$result_sql = $mysqli->query($sql))
{
echo "It broke"; // Change this to something more appropriate.
}
while($result = $result_sql->fetch_assoc())
{
$column = $result['column'];
if($checkbox == "true") // If "whole words only" is checked
{
if(is_array($keywords))
{
foreach($keywords as $word)
{
$pattern = "/\b".$word."\b/i"; // Adds '\b' before and after the word for whole words only.
$column = preg_replace($pattern,"<span class=\"highlight\">".$word."</span>", $column); // Applies the highlight CSS to each word found.
}
}
else
{
$pattern = "/\b".$keywords."\b/i"; // Adds '\b' before and after the word for whole words only.
$column = preg_replace($pattern,"<span class=\"highlight\">".$keywords."</span>", $column); // Applies the highlight CSS to each word found.
}
}
else
{
if(is_array($keywords))
{
foreach($keywords as $word)
{
$pattern = "/".$word."/i";
$column = preg_replace($pattern,"<span class=\"highlight\">".$word."</span>", $column); // Applies the highlight CSS to each word found.
}
}
else
{
$pattern = "/".$keywords."/i";
$column = preg_replace($pattern,"<span class=\"highlight\">".$keywords."</span>", $column); // Applies the highlight CSS to each word found.
}
}
echo "column: ".$column;
}
?>
EDIT :
Found one problem. If I search for the word "light" as a partial match, it picks up on class="highlight". I suppose the only answer to that is to use some gibberish for a class name?
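One way to approach both requirement #4 and the class="highlight" problem is sketched below. It is not tested against the original form: highlight() is a hypothetical helper, and it assumes a non-empty keyword list. All keywords are combined into one pattern and applied in a single pass, so markup inserted for one keyword is never scanned again for the next one.

```php
<?php
// Wrap every keyword occurrence in a highlight span, in one pass.
function highlight(string $text, array $keywords, bool $wholeWords): string
{
    // Escape regex metacharacters so keywords are matched literally.
    $parts = array_map(function ($word) {
        return preg_quote($word, '/');
    }, $keywords);
    $pattern = '(' . implode('|', $parts) . ')';
    if ($wholeWords) {
        $pattern = '\b' . $pattern . '\b';
    }
    // $1 re-inserts the matched text, so the original letter case is kept.
    return preg_replace('/' . $pattern . '/i', '<span class="highlight">$1</span>', $text);
}

echo highlight('Turn on the light', array('light', 'high'), false), "\n";

// For the query itself, the pattern can be bound like any other value
// (unquoted, e.g. "whole|part"), using the $mysqli object from the question:
// $stmt = $mysqli->prepare("SELECT `column` FROM `table` WHERE `column` REGEXP ?");
// $stmt->bind_param('s', $pattern);
// $stmt->execute();
```

On MySQL 8.0.4 and later, which use the ICU regex library, the [[:<:]] and [[:>:]] word-boundary markers are not supported and \b is used instead.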
How do I go about getting the most popular words from multiple content tables in PHP/MySQL?
For example, I have a table forum_post with forum posts; it contains a subject and content.
Besides this I have multiple other tables with different fields which could also contain content to be analysed.
I would probably fetch all the content myself, strip any HTML, explode the string on spaces, remove quotes, commas etc., and then count the words that are not common by building up an array while running through all the words.
My main question is whether someone knows of a method which might be easier or faster.
I couldn't find any helpful answers about this; I might be using the wrong search terms.
Somebody's already done it.
The magic you're looking for is a php function called str_word_count().
In my example code below, if you get a lot of extraneous words from this you'll need to write custom stripping to remove them. Additionally you'll want to strip all of the html tags from the words and other characters as well.
I use something similar to this for keyword generation (obviously that code is proprietary). In short, we take the provided text, check the word frequency, and sort the words in an array by priority, so the most frequent words come first in the output. We don't count words that only occur once.
<?php
$text = "your text.";
//Setup the array for storing word counts
$freqData = array();
foreach( str_word_count( $text, 1 ) as $words ){
// For each word found in the frequency table, increment its value by one
array_key_exists( $words, $freqData ) ? $freqData[ $words ]++ : $freqData[ $words ] = 1;
}
$list = '';
arsort($freqData);
foreach ($freqData as $word=>$count){
if ($count > 1){ // skip words that occur only once
$list .= "$word ";
}
}
if (empty($list)){
$list = "Not enough duplicate words for popularity contest.";
}
echo $list;
?>
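The custom stripping mentioned above could look roughly like this. It is a sketch only: word_frequencies is a made-up name and the stop-word list is a deliberately tiny placeholder that would need extending for real content.

```php
<?php
// Count how often each word occurs, ignoring markup, case and common words.
// $stopWords is a hypothetical, deliberately tiny list; extend it as needed.
function word_frequencies(string $html, array $stopWords = array('the', 'a', 'and', 'of', 'is')): array
{
    $text = strtolower(strip_tags($html));          // drop HTML tags, normalise case
    $words = str_word_count($text, 1);              // split into an array of words
    $words = array_diff($words, $stopWords);        // remove the common words
    $counts = array_count_values($words);           // word => number of occurrences
    arsort($counts);                                // most frequent first
    return $counts;
}

$counts = word_frequencies('<p>The dog and the big dog</p>');
print_r($counts);
```

Content from several tables can be concatenated into one string before it is passed in.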
I see you've accepted an answer, but I want to give you an alternative that might be more flexible in a sense: (Decide for yourself :-)) I've not tested the code, but I think you get the picture. $dbh is a PDO connection object. It's then up to you what you want to do with the resulting $words array.
<?php
$words = array();
$tableName = 'party'; //The name of the table
countWordsFromTable($words, $tableName);
$tableName = 'party2'; //The name of the table
countWordsFromTable($words, $tableName);
//Example output array:
/*
$words['word'][0] = 'happy'; //Happy from table party
$words['wordcount'][0] = 5;
$words['word'][1] = 'bulldog'; //Bulldog from table party2
$words['wordcount'][1] = 15;
$words['word'][2] = 'pokerface'; //Pokerface from table party2
$words['wordcount'][2] = 2;
*/
$popularIndex = array_search(max($words['wordcount']), $words['wordcount']); //Get the index of the highest word count
$mostPopularWord = $words['word'][$popularIndex]; //The entry stored at that index
function countWordsFromTable(&$words, $tableName) {
global $dbh; //The PDO connection is defined outside the function
//Table and column names cannot be bound as parameters, so $tableName must come from a trusted list, never from user input
//Get all fields from specific table
$q = $dbh->query("DESCRIBE `$tableName`");
$tableFields = $q->fetchAll(PDO::FETCH_COLUMN);
//Go through all fields and store count of words and their content in array $words
foreach($tableFields as $dbCol) {
$wordCountQuery = "SELECT `$dbCol` AS word, LENGTH(`$dbCol`) - LENGTH(REPLACE(`$dbCol`, ' ', ''))+1 AS wordcount FROM `$tableName`"; //Get count and the content of words from every column in db
$q = $dbh->query($wordCountQuery);
$wrds = $q->fetchAll(PDO::FETCH_ASSOC);
//Add result to array $words
foreach($wrds as $w) {
$words['word'][] = $w['word'];
$words['wordcount'][] = $w['wordcount'];
}
}
}
?>
I'm trying to create an advanced search form that looks something like this:
http://img805.imageshack.us/img805/7162/30989114.jpg
But what should I write for the query?
I know how to do it if there are only two text boxes, but with three there are too many combinations of what the user might fill in.
$query = "SELECT * FROM server WHERE ???";
What should I write for the "???"
I know how to use AND and OR in the query, but say the user fills in only two of the text boxes and leaves one empty. If I write something like this:
$query = "SELECT * FROM server WHERE model='".$model."' and brand='".$brand."' and SN='".$SN."'";
the result will be an empty set. I want the user to be able to fill in one, two or three of the criteria. If I use OR, the result will not be accurate: Model can have two rows with the same name (for example M4000) but different brands (for example IBM and SUN). If I use OR and the user searches for M4000 and SUN, it will display both M4000 rows.
If the user can decide how many criteria he wants to enter for your search and you want to combine those criteria (only those actually filled by the user), then you must dynamically create your SQL query to include only those fields in the search that are filled by the user. I'll give you an example.
The code for a simple search form could look like this:
$search_fields = Array(
// field name => label
'model' => 'Model',
'serialNum' => 'Serial Number',
'brand' => 'Brand Name'
);
echo "<form method=\"POST\">";
foreach ($search_fields as $field => $label) {
echo "$label: <input name=\"search[$field]\"><br>";
}
echo "<input type=\"submit\">";
echo "</form>";
And the code for an actual search like this:
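if (isset($_POST['search']) && is_array($_POST['search'])) {
// keep only the known fields, so field names from the request never reach the SQL
$search = array_intersect_key($_POST['search'], $search_fields);
// escape against SQL injection (array_map keeps the keys; array_filter would not escape anything)
$search = array_map('mysql_real_escape_string', $search);
// build SQL
$search_parts = array();
foreach ($search as $field => $value) {
if ($value) {
$search_parts[] = "$field LIKE '%$value%'";
}
}
$sql = "SELECT * FROM table";
if ($search_parts) {
$sql .= " WHERE " . implode(' AND ', $search_parts);
}
// do query here
}
else {
echo "please enter some search criteria!";
}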
In the above code we dynamically build the SQL string to do a search ("AND") for only the criteria entered.
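The same dynamic query can be written with bound parameters instead of escaping. A rough sketch under a few assumptions: build_search_query is a hypothetical helper, the table is the server table from the question, and the field names come from a hard-coded whitelist.

```php
<?php
// Build a WHERE clause from only the filled-in, whitelisted fields.
function build_search_query(array $allowedFields, array $input): array
{
    $clauses = array();
    $params = array();
    foreach ($allowedFields as $field) {
        $value = isset($input[$field]) ? trim($input[$field]) : '';
        if ($value !== '') {
            $clauses[] = "$field LIKE ?";   // field name comes from the whitelist
            $params[] = "%$value%";         // value is bound, not interpolated
        }
    }
    $sql = 'SELECT * FROM server';
    if ($clauses) {
        $sql .= ' WHERE ' . implode(' AND ', $clauses);
    }
    return array($sql, $params);
}

list($sql, $params) = build_search_query(
    array('model', 'serialNum', 'brand'),
    array('model' => 'M4000', 'serialNum' => '', 'brand' => 'SUN')
);
echo $sql, "\n";
// With a PDO connection in $pdo (an assumption):
// $stmt = $pdo->prepare($sql);
// $stmt->execute($params);
```

Empty fields simply produce no condition, so any combination of one, two or three criteria works.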
Try this code
<?php
$model="";
$brand="";
$serialNum="";
$model=mysql_real_escape_string($_POST['model']);
$brand=mysql_real_escape_string($_POST['brand']);
$serialNum=mysql_real_escape_string($_POST['serialNum']);
$query=" select * from server";
$where_str=" where ";
if($model == "" && $brand == "" && $serialNum == "")
{
$where_str = ""; // nothing to filter on, so drop the WHERE keyword
}
else
{
if($model != "")
{
$where_str.= " model like '%$model%' AND ";
}
if($brand != "")
{
$where_str.= " brand like '%$brand%' AND ";
}
if($serialNum != "")
{
$where_str.= " serialNum like '%$serialNum%' AND ";
}
$where_str = substr($where_str, 0, -strlen(" AND ")); // remove the trailing AND
}
$query.= $where_str;
$records=mysql_query($query);
?>
For those familiar with MySQL, it offers the ability to search by regular expressions (POSIX style). I needed an advanced way of searching in PHP, and my backend was MySQL, so this was the logical choice. Problem is, how do I build a whole MySQL query based on the input? Here are the types of queries I wanted to be able to process:
1. exact word matches
2. sub-string matches (I was doing this with like "%WORD%")
3. exclude via sub-string match
4. exclude via exact word match
A simple regexp query looks like:
select * from TABLE where ROW regexp '[[:<:]]bla[[:>:]]' and ROW
regexp 'foo';
This will look for an exact match of the string "bla", meaning not as a sub-string, and then match the sub-string "foo" somewhere.
So first off, items 1 and 4 are exact word matches and I want to be able to do this by surrounding the word with quotes. Let's set our necessary variables and then do a match on quotes:
$newq = $query; # $query is the raw query string
$qlevel = 0;
$curquery = "select * from TABLE where "; # the beginning of the query
$doneg = 0;
preg_match_all("/\"([^\"]*)\"/i", $query, $m);
$c = count($m[0]);
for ($i = 0; $i < $c; $i++) {
$temp = $m[1][$i]; # $temp is whats inside the quotes
Then I want to be able to exclude words, and the user should be able to do this by starting the word with a dash (-), and for exact word matches this has to be inside the quotes. The second match is to get rid of the - in front of the query.
if (preg_match("/^-/", $temp)) {
$pc = preg_match("/-([^-]*)/i", $m[1][$i], $dm);
if ($pc) {
$temp = $dm[1];
}
$doneg++;
}
Now we will set $temp to the posix compliant exact match, then build this part of the mysql query.
$temp = "[[:<:]]".$temp."[[:>:]]";
if ($qlevel) $curquery .= "and "; # are we nested?
$curquery .= "ROW "; # the mysql row we are searching in
if ($doneg) $curquery .= "not "; # if dash in front, do not
$curquery .= "regexp ".quote_smart($temp)." ";
$qlevel++;
$doneg = 0;
$newq = str_replace($m[0][$i], "", $newq);
}
The variable $newq has the rest of the search string, minus everything in quotes, so whatever remains are sub-string search items falling under 2 and 3. Now we can go through what is left and basically do the same thing as above.
$s = preg_split("/\s+/", $newq, -1, PREG_SPLIT_NO_EMPTY); #whitespaces
for ($i = 0; $i < count($s); $i++) {
if (preg_match("/^-/", $s[$i])) { # exclude
sscanf($s[$i], "-%s", $temp); # this is poor
$s[$i] = $temp;
$doneg++;
}
if ($qlevel) $curquery .= "and ";
$curquery .= "ROW "; # the mysql row we are searching in
if ($doneg) $curquery .= "not ";
$curquery .= "regexp ".quote_smart($s[$i])." ";
$qlevel++;
$doneg = 0;
}
# use $curquery here in database
The variable $curquery now contains our built mysql query. You will notice the use of quote_smart in here, this is a mysql best practice from php.net. It's the only mention of security anywhere in this code. You will need to run your own checking against the input to make sure there are no bad characters, mine only allows alpha-numerics and a few others. DO NOT use this code as is without first fixing that.
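The parsing half of the walkthrough above can be condensed into one function that only classifies the terms and leaves the SQL building to a later step. This is a sketch, not the code from the post: parse_search_terms and the array keys 'text', 'exact' and 'negate' are made-up names.

```php
<?php
// Split a raw search string into terms with "exact" and "negate" flags.
function parse_search_terms(string $raw): array
{
    $terms = array();
    // Either a quoted phrase (group 1) or a run of non-space characters (group 2).
    preg_match_all('/"([^"]*)"|(\S+)/', $raw, $matches, PREG_SET_ORDER);
    foreach ($matches as $m) {
        $exact = !isset($m[2]);                 // group 2 is absent for quoted phrases
        $text = $exact ? $m[1] : $m[2];
        $negate = strncmp($text, '-', 1) === 0; // a leading dash means "exclude"
        if ($negate) {
            $text = (string) substr($text, 1);
        }
        if ($text === '') {
            continue;                           // skip empty quotes or a lone dash
        }
        $terms[] = array('text' => $text, 'exact' => $exact, 'negate' => $negate);
    }
    return $terms;
}

print_r(parse_search_terms('"bla" foo -bar "-baz qux"'));
```

Each returned term maps onto one "ROW [not] regexp ..." condition, with exact terms wrapped in the word-boundary markers before being quoted or bound.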
You have to provide $model, $brand, $serial which come from your search-form.
$query = "SELECT * FROM `TABLE` WHERE `model` LIKE '%$model%' AND `brand` LIKE '%$brand%' AND `serial` LIKE '%$serial%'";
Also take a look at the mysql doc
http://dev.mysql.com/doc/refman/5.1/en/string-comparison-functions.html
A basic search would work like this:
"SELECT * FROM server WHERE column_name1 LIKE '%keyword1%' AND column_name2 LIKE '%keyword2%' .....";
This would be the case for matching all parameters. For matching any one of the criteria, change the ANDs to ORs.