How to implement a Search Algorithm - php

This is the first time I am writing an actual search feature for my database.
The database consists of hotel names, hotel food items, hotel locations.
I would like the above three to show up during a search of a string.
Are there any common search algorithms or packages that can be used?
EXPECTED RESULT SET:
id | name | description | table_name | rank
56 | KFC | Fried chicken | hotel | 1
12 | [food item name] | [food item description] | food_item | 2
19 | [hotel name] | [hotel description] | hotel | 3
....

Do you mean a relational database? If yes, your "search" algorithm is a WHERE clause.
Do you mean contextual search? Lucene is a great search engine implementation written in Java. This might help you marry it with Lucene:
http://www.cabotsolutions.com/2009/05/using-solr-lucene-for-full-text-search-with-mysql-db/
The answer is far more complicated if you're thinking about crawling web sites based on some criteria. Please clarify.

If you are using Microsoft SQL Server, FreeText works very well:
http://msdn.microsoft.com/en-us/library/ms176078.aspx

Let's assume you're using MySQL.
Your question is basically: how do I write a query that searches hotel names, food items, and hotel locations?
I guess these three pieces of information are stored in three different tables. The easiest way would be to simply query the three tables one after the other with queries like these:
SELECT * FROM hotel WHERE hotel_name LIKE "%foobar%";
SELECT * FROM hotel_food_item WHERE item_name LIKE "%foobar%";
SELECT * FROM hotel_location WHERE hotel_name LIKE "%foobar%" OR street_name LIKE "%foobar%" OR city LIKE "%foobar%";
Make sure your search terms are safe from SQL injection.
You may (or may not) want to combine the queries into one bigger query.
If your database is becoming large (say, more than 100,000 rows per table), or if you run a lot of search queries, you might be interested in creating a search index, or in using a dedicated engine intended for text search, such as Elasticsearch.
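As a rough illustration of the two points above (binding the search term and combining the queries), here is a minimal sketch. It uses Python's built-in sqlite3 module as a stand-in for MySQL, and the sample rows and the search() helper are made up for the example. The same UNION ALL idea carries over to MySQL with your real tables.

```python
import sqlite3

# SQLite stands in for MySQL here; tables and rows are made-up sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE hotel (id INTEGER PRIMARY KEY, hotel_name TEXT);
    CREATE TABLE hotel_food_item (id INTEGER PRIMARY KEY, item_name TEXT);
    INSERT INTO hotel VALUES (56, 'KFC'), (19, 'Chicken Palace');
    INSERT INTO hotel_food_item VALUES (12, 'Fried chicken'), (13, 'Veggie wrap');
""")

def search(term):
    """Search both tables in one query; the term is bound, never concatenated."""
    pattern = f"%{term}%"
    sql = """
        SELECT id, hotel_name AS name, 'hotel' AS table_name
        FROM hotel WHERE hotel_name LIKE ?
        UNION ALL
        SELECT id, item_name AS name, 'food_item' AS table_name
        FROM hotel_food_item WHERE item_name LIKE ?
        ORDER BY table_name, id
    """
    return conn.execute(sql, (pattern, pattern)).fetchall()

results = search("chicken")
print(results)
```

Because the term is passed as a bound parameter, input such as ' OR 1=1 -- is treated as plain text to search for rather than as SQL.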
Edit:
If relevance matters, use MATCH ... AGAINST:
http://maisonbisson.com/blog/post/10752/making-mysql-do-relevance-ranked-full-text-searches/
http://www.devshed.com/c/a/PHP/Using-Relevance-Rankings-for-Full-Text-and-Boolean-Searches-with-MySQL/
PHP MySQL Search And Order By Relevancy
You'll have to create three subqueries that use MATCH ... AGAINST, and then combine them. You can select MATCH (...) AGAINST ("foobar") AS rank so you'll have the score you need.
This should look like:
SELECT *
FROM
(
SELECT id, 'hotel' as table_name, MATCH (search_field1) AGAINST ("lorem") as rank FROM tableA
UNION
SELECT id, 'food' as table_name, MATCH (search_field2) AGAINST ("lorem") as rank FROM tableB
) as res
ORDER BY res.rank DESC

If you are using MyISAM tables (or InnoDB on MySQL 5.6 or later), you can use MySQL's built-in full-text search.
This works by first putting a full-text index on the columns you wish to search, and then creating a query that looks roughly like this (bind the search string to both ? placeholders):
SELECT *, MATCH(column_to_search) AGAINST(?) AS relevance
FROM your_table
WHERE MATCH(column_to_search) AGAINST(? IN BOOLEAN MODE)
ORDER BY relevance DESC
LIMIT 20

Related

mysql like query exclude numbers

I have a small problem with a php mysql query, I am looking for help.
I have a family tree table, where I am storing for each person his/her ancestors id separated by a comma. like so
id ancestors
10 1,3,4,5
So the person of id 10 is fathered by id 5 who is fathered by id 4 who is fathered by 3 etc...
Now I wish to select all the people who have id x in their ancestors, so the query will be something like:
select * from people where ancestors like '%x%'
Now this would work fine, except that if id x is, let's say, 2, and a record has an ancestor id 32, this LIKE query will retrieve 32 because 32 contains 2. And if I use '%,x,%' (including commas), the query will ignore the records whose ancestor x is on either edge (left or right) of the column. It will also ignore the records where x is the only ancestor, since no commas are present.
So in short, I need a like query that looks up an expression that either is surrounded by commas or not surrounded by anything. Or a query that gets the regular expression provided that no numbers are around. And I need it as efficient as possible (I suck at writing regular expressions)
Thank you.
Edit: Okay guys, help me come up with a better schema.
You are not storing your data in a proper way. Anyway, if you still want to use this schema you should use FIND_IN_SET instead of LIKE to avoid undesired results.
SELECT *
FROM mytable
WHERE FIND_IN_SET(2, ancestors) <> 0
You should consider redesigning your database structure. Add a new table "ancestors" to the database with these columns:
id | id_person | ancestor
1 | 10 | 1
2 | 10 | 3
3 | 10 | 4
Afterwards, use a JOIN query (or "WHERE ... IN") to select the right rows.
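A minimal sketch of that redesign, assuming one row per (person, ancestor) pair. It uses Python's sqlite3 module in place of MySQL, a composite primary key instead of the surrogate id column, and made-up people; descendants_of() is a hypothetical helper name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE ancestors (
        id_person INTEGER NOT NULL,
        ancestor  INTEGER NOT NULL,
        PRIMARY KEY (id_person, ancestor)
    );
    -- Made-up sample data: person 11 descends from 32, person 12 from 2.
    INSERT INTO people VALUES (10, 'Ann'), (11, 'Ben'), (12, 'Cy');
    INSERT INTO ancestors VALUES (10, 1), (10, 3), (10, 4), (10, 5),
                                 (11, 32), (12, 2);
""")

def descendants_of(ancestor_id):
    """Ids of all people that list ancestor_id among their ancestors."""
    sql = """
        SELECT p.id
        FROM people AS p
        JOIN ancestors AS a ON a.id_person = p.id
        WHERE a.ancestor = ?
        ORDER BY p.id
    """
    return [row[0] for row in conn.execute(sql, (ancestor_id,))]

print(descendants_of(2))
```

Since ids are compared as integers, searching for 2 can no longer pick up 32, and the lookup can use an index on the ancestor column.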
You're having this issue because of a flawed database design. First, relational databases aren't really meant for this kind of data; graph databases are a better fit for this kind of problem.
If it contains a small amount of data you could use MySQL, but the design is still wrong. If you only care about each person's father, just add a column to the person table (or whatever you call it): if it is NULL, the father is unknown; otherwise it contains the id (int) of the parent.
If you need more than just the 'father' relationship, you could use a pivot table that holds the relationship between two persons, but that's not a simple task.
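For the single father column, all descendants of a person can be collected with a recursive query. The sketch below is only an illustration under assumptions: it runs on SQLite through Python's sqlite3 module, the person table and father_id column are hypothetical names, and WITH RECURSIVE needs MySQL 8.0 or later if you port it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (
        id        INTEGER PRIMARY KEY,
        father_id INTEGER REFERENCES person(id)  -- NULL means unknown
    );
    -- Chain from the question: 1 -> 3 -> 4 -> 5 -> 10, plus an unrelated 32.
    INSERT INTO person VALUES
        (1, NULL), (3, 1), (4, 3), (5, 4), (10, 5), (32, NULL);
""")

def descendants_of(person_id):
    """Walk the father_id links downwards, one generation per recursion step."""
    sql = """
        WITH RECURSIVE line(id) AS (
            SELECT id FROM person WHERE father_id = ?
            UNION ALL
            SELECT p.id FROM person AS p JOIN line ON p.father_id = line.id
        )
        SELECT id FROM line ORDER BY id
    """
    return [row[0] for row in conn.execute(sql, (person_id,))]

print(descendants_of(3))
```

The trade-off is that each row stays tiny and easy to update, while ancestor lookups cost one recursion step per generation.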
There are a few established ways of storing hierarchical data in RDBMS. I've found this slideshow to be very helpful in the past:
Models for Hierarchical Design
Since the data deals with ancestry - and therefore you wouldn't expect it to change that often - a closure table could fit the bill.
Whatever model you choose, be sure to look around and see if someone else has already implemented it.
You could store your values as a JSON array
id | ancestors
10 | ["1","3","4","5"]
and then query as follows:
$query = 'select * from people where ancestors like \'%"x"%\'';
Of course, using a mapping table for your many-to-many relation is better.
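You can do this with REGEXP:
SELECT * FROM mytable WHERE ancestors REGEXP '(^|,)x(,|$)'
where x is the value you are searching for. The pattern requires x to be preceded by a comma or the start of the string and followed by a comma or the end of the string, so searching for 2 does not match 32.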
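A pattern that demands a comma or a string boundary on each side of the id can be checked quickly outside the database. This is a small sketch in Python's re module, not MySQL itself; has_ancestor() is a made-up helper, and the pattern it builds has the shape (^|,)x(,|$).

```python
import re

def has_ancestor(ancestors, x):
    """True if id x appears as a whole item in a comma-separated list."""
    # (^|,) : start of string or a comma before the id
    # (,|$) : a comma or end of string after the id
    pattern = rf"(^|,){re.escape(str(x))}(,|$)"
    return re.search(pattern, ancestors) is not None

print(has_ancestor("1,3,4,5", 5))   # id on the right edge
print(has_ancestor("1,32,4", 2))    # 2 only occurs inside 32
```

Keep in mind that a regular expression on the column still forces a scan of every row, so this fixes correctness rather than speed.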
DROP TABLE IF EXISTS my_table;
CREATE TABLE my_table
(id INT NOT NULL AUTO_INCREMENT PRIMARY KEY
,ancestors VARCHAR(250) NOT NULL
);
INSERT INTO my_table VALUES(10,',1,3,4,5');
SELECT *
FROM my_table
WHERE CONCAT(ancestors,',') LIKE '%,5,%';
+----+-----------+
| id | ancestors |
+----+-----------+
| 10 | ,1,3,4,5 |
+----+-----------+
SELECT *
FROM my_table
WHERE CONCAT(ancestors,',') LIKE '%,4,%';
+----+-----------+
| id | ancestors |
+----+-----------+
| 10 | ,1,3,4,5 |
+----+-----------+

Partial keyword searching in MySQL

Context:
I am trying to create a search function for my website where a user can type in full sentences and receive results back based on the matching of keywords in the sentence with words stored in a MySQL database:
**ID | Skill**
1 | Painting
2 | Carpenter
3 | Builder
For example a user may search "I want some painting to be done" and using the following MySQL query (along with a foreach and explode function) it will return ID 1 from the database:
$stmt = $mysqli->prepare ("SELECT username FROM users WHERE users.id IN (SELECT
skills.userid FROM skills WHERE skills.skill LIKE CONCAT('%',?,'%') GROUP BY
skills.skill ORDER BY CASE WHEN skills.skill LIKE CONCAT(?,'%') THEN 0 WHEN
skills.skill LIKE CONCAT('% %',?,'% %') THEN 1 WHEN skills.skill LIKE CONCAT('%',?)
THEN 2 ELSE 3 END, skills.skill)");
Exam question:
The issue I have is that if a user was to type "I want a painter" then ID 1 would not be returned. How can the query be modified to account for the fact that painting and painter are similar and so should be returned?
You can add a column called synonyms to the skills table, holding some keywords for that skill.
For example, the "Painting" row would have "paint painting painter" in the synonyms column.
Then you change your query to check the synonyms column instead of the skill column.
This is the simplest way, but it requires you to add synonyms to each row of the skills table.
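Here is a minimal sketch of the synonyms-column idea, with SQLite (via Python's sqlite3) standing in for MySQL. The synonym lists beyond the "Painting" example and the match_skills() helper are invented for illustration, and the word splitting is deliberately naive (it does not strip punctuation).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE skills (id INTEGER PRIMARY KEY, skill TEXT, synonyms TEXT);
    -- Space-separated synonym lists; only the first row comes from the answer.
    INSERT INTO skills VALUES
        (1, 'Painting',  'paint painting painter'),
        (2, 'Carpenter', 'carpenter carpentry woodwork'),
        (3, 'Builder',   'builder building build');
""")

def match_skills(sentence):
    """Return ids of skills whose synonym list contains a word of the sentence."""
    found = set()
    for word in sentence.lower().split():
        # Pad both sides with spaces so only whole synonyms match.
        rows = conn.execute(
            "SELECT id FROM skills WHERE ' ' || synonyms || ' ' LIKE ?",
            (f"% {word} %",),
        )
        found.update(row[0] for row in rows)
    return sorted(found)

print(match_skills("I want a painter"))
```

Matching whole words against the synonym list avoids accidental hits such as the word "a" matching every skill, which a plain '%word%' pattern would produce.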

Use search results (Like %search%), to match id's in another table in the database

I have two tables in the database, parts, and products.
I have a column in the products table with strings of ids (comma separated). Those ids match ids of the parts table.
**parts**
ID | description (I'm searching this part)
-------------------------------
1 | some text here
2 | some different text here
3 | etc...
**products**
ID | parts-list
--------------------------------
1 | 1,2,3
2 | 2,3
3 | 1,2
I'm really struggling with the SQL query on this one.
I've done the 1st part, got the id's from the parts table
SELECT * FROM parts WHERE description LIKE '%{$search}%'
The biggest problem is the comma-separated structure of the parts-list column.
Obviously, I could do it in PHP: create an array of the results from the parts table, use that to search the products table for ids, and then use those results to grab the row data from the parts table (again). Not very efficient.
I also tried this, but I'm obviously trying to compare two arrays here, not sure how this should be done.
SELECT * FROM `products` WHERE
CONCAT(',', description, ',')
IN (SELECT `id` FROM `parts` WHERE `description` LIKE '%{$search}%')
Can anybody help?
I would perhaps try a combination of LOCATE() and SUBSTR(). I work mainly in MSSQL which has CHARINDEX() that I think works like MySQL's LOCATE(). It is bound to be messy. Are there a variable number of elements in the parts-list field?
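If changing the schema is an option, the comma-separated list can be replaced by a link table with one row per (product, part) pair, which turns the search into an ordinary JOIN. The following is a sketch under that assumption, using Python's sqlite3 in place of MySQL; product_parts and products_with_part_matching() are hypothetical names and the third part description is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE parts (id INTEGER PRIMARY KEY, description TEXT);
    CREATE TABLE products (id INTEGER PRIMARY KEY);
    CREATE TABLE product_parts (
        product_id INTEGER NOT NULL REFERENCES products(id),
        part_id    INTEGER NOT NULL REFERENCES parts(id),
        PRIMARY KEY (product_id, part_id)
    );
    INSERT INTO parts VALUES (1, 'some text here'),
                             (2, 'some different text here'),
                             (3, 'another part');
    INSERT INTO products VALUES (1), (2), (3);
    -- Same lists as in the question: 1 -> 1,2,3  2 -> 2,3  3 -> 1,2
    INSERT INTO product_parts VALUES
        (1, 1), (1, 2), (1, 3), (2, 2), (2, 3), (3, 1), (3, 2);
""")

def products_with_part_matching(search):
    """Ids of products that contain at least one part matching the search text."""
    sql = """
        SELECT DISTINCT pp.product_id
        FROM product_parts AS pp
        JOIN parts AS p ON p.id = pp.part_id
        WHERE p.description LIKE ?
        ORDER BY pp.product_id
    """
    return [row[0] for row in conn.execute(sql, (f"%{search}%",))]

print(products_with_part_matching("another"))
```

If the existing column has to stay, MySQL's FIND_IN_SET(parts.id, ...) on the list column can serve as the join condition instead, though it cannot use an index.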

MYSQL UNION across 3 tables with ORDER BY

We have this statement:
(SELECT res_bev.bev_id, property_value.name AS priority
FROM res_bev, bev_property, property_value
WHERE res_bev.res_id='$resIn'
AND bev_property.bev_id=res_bev.bev_id
AND bev_property.type_id='23'
AND property_value.id=bev_property.val_id)
UNION
(SELECT res_bev.bev_id, property_value.name as priority
FROM res_bev, bev_property, property_value
WHERE res_bev.res_id='$resIn'
AND bev_property.bev_id=res_bev.bev_id
AND bev_property.type_id='22'
AND property_value.id=bev_property.val_id)
We have Three Tables:
Res_bev
res_id | bev_id | id
Bev_property
type_id | val_id | bev_id | id
Property_value
name | id
What I am looking for is the results to be ordered by glass price (type_id='23') then bottle price (type_id='22'). However, the union seems to include duplicates, because the first select returns, say, 3456 | 7.5 and the second returns 3456 | 55, since the price per glass is 7.5 and the price per bottle is 55. How can I eliminate these duplicates from the second SQL statement and return an ordered table?
Also, fooled with creating a pseudo-table via left joins to create a table of bev_id | price/Glass | price/Bottle, however since this should be able to expand to multiple price types I figured a UNION would be more efficient. Just a push in the right direction would be helpful.
You can do it in one query by matching bev_property.type_id against an IN() clause that lists both values.
To drop duplicate rows, use SELECT DISTINCT. Note that DISTINCT compares the whole selected row, so a bev_id that has both a glass price and a bottle price still appears once per price.
To order the results, just add an appropriate descending ORDER BY clause. (Databases never return rows in a specific order unless you tell them to. Some have an internal convention, or appear to, but the order is never guaranteed to be repeatable unless you specify an ORDER BY clause in your SELECT statement.)
SELECT DISTINCT res_bev.bev_id, property_value.name AS priority
FROM res_bev, bev_property, property_value
WHERE res_bev.res_id='$resIn'
AND bev_property.bev_id=res_bev.bev_id
AND bev_property.type_id IN ('23','22')
AND property_value.id=bev_property.val_id
ORDER BY bev_property.type_id DESC;
A UNION won't really be faster, since you'd have to do the whole lookup twice. If this field isn't indexed, you'll do a full table scan matching against one value twice, as opposed to one table scan that matches against two values. (Walking over a whole table is what's generally slow, not comparing simple values.)
When properly indexed, I think you might have a tiny overhead from executing a second SELECT and running the query analyzer again, but I don't know for sure. It'll probably be smart enough to recognise the similarities between the queries, so it won't matter.
It never hurts to try on your specific database, though. Whenever you compare different statements for query optimisation, run them with EXPLAIN; this will show you what the query will be doing and whether it will scan whole tables, sort data on disk, etc.
Unless I'm missing something in your question, a single query should do:
SELECT res_bev.bev_id,
property_value.name AS priority
FROM res_bev, bev_property, property_value
WHERE res_bev.res_id='$resIn'
AND bev_property.bev_id=res_bev.bev_id
AND (bev_property.type_id='23' OR bev_property.type_id='22')
AND property_value.id=bev_property.val_id
ORDER BY bev_property.type_id DESC
PS if you want to order a union
try something along the lines of
Select * from
(
select ...
Union
select ...
) somenameforqryinparentheses
Order by Somecolumn1, somecolumn2
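To see the wrapped-UNION pattern run end to end, here is a small sketch on SQLite through Python's sqlite3. It collapses the three joined tables from the question into a single hypothetical prices table, and the second beverage is made-up data, so treat it as an illustration of the ordering only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Simplified stand-in for res_bev + bev_property + property_value.
    CREATE TABLE prices (bev_id INTEGER, type_id INTEGER, price REAL);
    INSERT INTO prices VALUES (3456, 23, 7.5), (3456, 22, 55), (1200, 23, 6.0);
""")

sql = """
    SELECT * FROM (
        SELECT bev_id, type_id, price FROM prices WHERE type_id = 23
        UNION
        SELECT bev_id, type_id, price FROM prices WHERE type_id = 22
    ) AS u
    ORDER BY type_id DESC, price
"""
rows = conn.execute(sql).fetchall()
for row in rows:
    print(row)
```

Selecting type_id inside each branch is what makes it available to the outer ORDER BY; glass prices (23) come first, then bottle prices (22).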

Select random row per distinct field value?

I have a MySQL query that results in something like this:
person | some_info
==================
bob | pphsmbf24
bob | rz72nixdy
bob | rbqqarywk
john | kif9adxxn
john | 77tp431p4
john | hx4t0e76j
john | 4yiomqv4i
alex | n25pz8z83
alex | orq9w7c24
alex | beuz1p133
etc...
(This is just a simplified example. In reality there are about 5000 rows in my results).
What I need to do is go through each person in the list (bob, john, alex, etc...) and pull out a row from their set of results. The row I pull out is sort of random but sort of also based on a loose set of conditions. It's not really important to specify the conditions here so I'll just say it's a random row for the example.
Anyways, using PHP, this solution is pretty simple. I make my query and get 5000 rows back and iterate through them pulling out my random row for each person. Easy.
However, I'm wondering if it's possible to get what I would from only a MySQL query so that I don't have to use PHP to iterate through the results and pull out my random rows.
I have a feeling it might involve a BUNCH of subselects, like one for each person, in which case that solution would be more time, resource and bandwidth intensive than my current solution.
Is there a clever query that can accomplish this all in one command?
Here is an SQLFiddle that you can play with.
To get a random value for a distinct name use
SELECT r.name,
(SELECT r1.some_info FROM test AS r1 WHERE r.name=r1.name ORDER BY rand() LIMIT 1) AS 'some_info'
FROM test AS r
GROUP BY r.name ;
Put this query as it stands in your sqlfiddle and it will work
I'm using r and r1 as table aliases. The subquery selects a random some_info for each name.
SQL Fiddle is here
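The same correlated-subquery idea can be tried locally without the fiddle. This sketch uses SQLite via Python's sqlite3, where the function is random() rather than MySQL's RAND(), and it takes the distinct names from a derived table instead of GROUP BY; the rows are a subset of the sample data in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE test (name TEXT, some_info TEXT);
    INSERT INTO test VALUES
        ('bob', 'pphsmbf24'), ('bob', 'rz72nixdy'), ('bob', 'rbqqarywk'),
        ('john', 'kif9adxxn'), ('john', '77tp431p4'),
        ('alex', 'n25pz8z83'), ('alex', 'orq9w7c24');
""")

sql = """
    SELECT p.name,
           (SELECT t.some_info
            FROM test AS t
            WHERE t.name = p.name
            ORDER BY random()   -- RAND() in MySQL
            LIMIT 1) AS some_info
    FROM (SELECT DISTINCT name FROM test) AS p
    ORDER BY p.name
"""
picks = conn.execute(sql).fetchall()
for name, info in picks:
    print(name, info)
```

Each name appears exactly once, and the some_info value changes from run to run. With about 5000 rows this should be fine, but the subquery sorts each person's rows once per person, so it may be worth timing on larger tables.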
My first response would be to use php to generate a random number:
$randId = rand($min, $max);
Then run a SQL query that only gets the record where your index equals $randID.
Here is the solution:
select person, acting from personel where id in (
select lim from
(select count(person) c, min(id) i, cast(rand()*(count(person)-1) +min(id)
as unsigned) lim from personel group by person order by i) t1
)
The table used in the example is below:
create table personel (
id int(11) not null auto_increment,
person char(16),
acting char(19),
primary key(id)
);
insert into personel (person,acting) values
('john','abd'),('john','aabd'),('john','adbd'),('john','abfd'),
('alex','ab2d'),('alex','abd3'),('alex','ab4d'),('alex','a6bd'),
('max','ab2d'),('max','abd3'),('max','ab4d'),('max','a6bd'),
('jimmy','ab2d'),('jimmy','abd3'),('jimmy','ab4d'),('jimmy','a6bd');
You can limit the number of rows and order by rand() to get your desired result.
Perhaps if you tried something like this:
SELECT name, some_info
FROM test
WHERE name = 'tara'
ORDER BY rand()
LIMIT 1
