I want to build up a database where there will be about 400 strings. I want to make the database searchable.
The structure of the database will be like:
Brand | model |additional products | price | search words | (this is 1 string, there will about 400 strings)
There will be between 2 and 50 search words on each string. Searches are done by ticking checkboxes; the words of the marked checkboxes are then searched for in the database.
My question is: what is the best way to index all the search words?
I’m thinking of 2 ways:
Option 1: in the search words field, all searchable words are listed like: 1GB RAM, 512GB RAM, ATA, SATA… and so on for each string. This means that ALL words will be in the same row for a specific string, separated by “,”.
Option 2: each search word has its own column like: | search words 1| search words 2| search words 3 | search words 4 | search words 5|….. and so on. In |search words 1| the word 1GB RAM will be, in | search words 2| the word 512GB RAM, and so on… This means that for a given string maybe half of the search word columns will be filled.
In option 2 there will be more than 50 columns in the database, with each search word in a different column (1 in each column for each product). In option 1 there will be 1 row per product with all words in the same column.
Or is there a better way to do this?
Even though another answer was accepted... I explained this idea a little more because I feel that it meets "best practices" and allows you to associate more than one word with one item while not repeating data.
You should end up with three tables:
item: item_id | Brand | model |additional products | price
word: word_id | word
item_word: item_word_id | item_id | word_id
the data would look like:
item:
item_id | brand | model | additional_products | price
1 | nokia | g5 | | 100
2 | toshiba | satellite | | 1000
word:
word_id | word
1 | 1GB RAM
2 | ATA
3 | SATA
4 | 512GB RAM
item_word:
item_word_id | item_id | word_id
1 | 1 | 1
2 | 1 | 2
3 | 2 | 3
4 | 2 | 4
so that nokia had these words: 1GB RAM, ATA and toshiba had these words: SATA, 512GB RAM. (I realize this doesn't make much sense, it's just an example)
then query it like..
select item.*, word
from item
join item_word on item.item_id = item_word.item_id
join word on item_word.word_id = word.word_id
and filter it like...
select item.*, word
from item
join item_word on item.item_id = item_word.item_id
join word on item_word.word_id = word.word_id
where word in ('1GB RAM', '512GB RAM', 'ATA')
to see what is the most relevant result you could even try...
select item.item_id, item.brand, item.model, count(*) as word_count
from item
join item_word on item.item_id = item_word.item_id
join word on item_word.word_id = word.word_id
where word in ('1GB RAM', '512GB RAM', 'ATA')
group by item.item_id, item.brand, item.model
order by count(*) desc
for something that matches all the words provided, you would use...
select item.item_id, item.brand, item.model, count(*) as word_count
from item
join item_word on item.item_id = item_word.item_id
join word on item_word.word_id = word.word_id
where word in ('1GB RAM', 'ATA')
group by item.item_id, item.brand, item.model
having count(*)=2
where the number in the having clause is the number of words in your in statement... word in ('1GB RAM', 'ATA'). in this case it was 2.
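Here is a minimal runnable sketch of the three-table layout and the "match all words" query, using Python's built-in sqlite3 as a stand-in for the real database. The sample rows and the helper name items_matching_all are made up for illustration, and item_word uses a composite primary key instead of a separate item_word_id, which is an assumption on my part and not a requirement.

```python
import sqlite3

# In-memory SQLite stands in for the real database; table and column names
# follow the three-table layout above, the rows are invented sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE item (item_id INTEGER PRIMARY KEY, brand TEXT, model TEXT, price INTEGER);
    CREATE TABLE word (word_id INTEGER PRIMARY KEY, word TEXT UNIQUE);
    CREATE TABLE item_word (
        item_id INTEGER REFERENCES item(item_id),
        word_id INTEGER REFERENCES word(word_id),
        PRIMARY KEY (item_id, word_id)  -- assumption: one row per item/word pair
    );
    INSERT INTO item VALUES (1, 'nokia', 'g5', 100), (2, 'toshiba', 'satellite', 1000);
    INSERT INTO word VALUES (1, '1GB RAM'), (2, 'ATA'), (3, 'SATA'), (4, '512GB RAM');
    INSERT INTO item_word VALUES (1, 1), (1, 2), (2, 3), (2, 4);
""")

def items_matching_all(conn, words):
    """Return (item_id, brand, model) for items tagged with every word in `words`.

    `words` is assumed to contain no duplicates, otherwise the HAVING count is off.
    """
    placeholders = ", ".join("?" for _ in words)
    sql = f"""
        SELECT item.item_id, item.brand, item.model
        FROM item
        JOIN item_word ON item.item_id = item_word.item_id
        JOIN word ON item_word.word_id = word.word_id
        WHERE word.word IN ({placeholders})
        GROUP BY item.item_id, item.brand, item.model
        HAVING COUNT(*) = ?
        ORDER BY item.item_id
    """
    # The last parameter is the number of checked words, as described above.
    return conn.execute(sql, [*words, len(words)]).fetchall()

all_match = items_matching_all(conn, ["1GB RAM", "ATA"])   # both words belong to nokia
partial = items_matching_all(conn, ["1GB RAM", "SATA"])    # no item has both
```

The checked checkbox values would be passed in as the words list, so the having count always follows the number of boxes ticked.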
if you just do...
item: Brand | model |additional products | price | long_word_string
then you have to do...
select *
from item
where long_word_string like '%1GB RAM%' or long_word_string like '%ATA%'
or even...
select *
from item
where long_word_string regexp '1GB RAM|ATA'
but those are very inefficient/costly methods... and it is better to just normalize things so you're not storing extra data and killing performance trying to get it out
does that make sense? does it answer your question?
edit: my answer lost out to just two tables... i'm concerned for OP's database now.
Storing your search terms in never-ending additional columns runs counter to database "normalization". Storing everything in one column is usually the last option, since it is much easier to break down search terms if you use multiple columns.
Make a separate table and join your original table to this table. Your structure would look something like this:
Original table
New table
I added a primary key column to your original table. This will make the JOIN easier. Use the following statement to join the two tables:
SELECT ABB2.*
FROM original_table AS ABB2
JOIN new_table AS ABB1 ON ABB1.product_id = ABB2.id
WHERE ABB1.search_word = "your search term"
The "search_word" column in the new table holds the terms associated with each of the entries in your original table.
You can switch the WHERE condition to LIKE and add "%" wildcards (for example LIKE "%your search term%") if you want a partial-match search that returns all results containing your search term.
Thanks for all the suggestions. It was very helpful. I think I will go for the separate table for keywords, but I'm not sure how to code this part, so I will have to start learning that too :)
I have this very specific problem which I can't even decide how to approach. So I have 3 tables in MySQL.
Table recipe: id_recipe| name | text | picture
Table ingredients_recipe: id_rs | id_recipe| id_ingredients
Table ingredients: id_ingredient | name | picture
This is a site, where you select ingredients(so the input is 1 or more id_ingredient) and it should display three categories:
All recipes you can make right now (you have all the ingredients required for it)
All recipes where you are missing only 1 or 2 ingredients
All recipes where you are missing only 3 or 4 ingredients.
Can you help me with these 3 SQL selects? I'm pretty deadlocked right now. Thanks.
SAMPLE DATA: http://pastebin.com/aTC5kQJi
I think your basic statement is already on the right track. You just need to do a little trick. You cannot compare them directly, but you can compare the count of ingredients:
SELECT id_recipe, count(id_rs) as ingredient_count
FROM ingredients_recipe
WHERE id_ingredient IN ( 2, 5)
GROUP BY id_recipe
This will give you the count of ingredients you have for each recipe. Now get the total number of ingredients for each recipe
SELECT id_recipe, count(id_rs) as ingredient_count
FROM ingredients_recipe
GROUP BY id_recipe
and compare them, taking the first query as a basis. The difference between the two counts is the number of missing ingredients, so you can easily get your three different categories out of this.
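As a rough sketch of that comparison, the two counts can be folded into one grouped query and the difference used to sort recipes into the three categories. This uses Python's sqlite3 with invented sample rows (not the pastebin data), and the helper name missing_counts is hypothetical. The column is called id_ingredient here, following the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ingredients_recipe (
        id_rs INTEGER PRIMARY KEY,
        id_recipe INTEGER,
        id_ingredient INTEGER
    );
    -- Invented data: recipe 1 needs {2, 5}, recipe 2 needs {2, 5, 7},
    -- recipe 3 needs {1, 3, 4, 6}.
    INSERT INTO ingredients_recipe (id_recipe, id_ingredient) VALUES
        (1, 2), (1, 5),
        (2, 2), (2, 5), (2, 7),
        (3, 1), (3, 3), (3, 4), (3, 6);
""")

def missing_counts(conn, owned):
    """Map each id_recipe to the number of its ingredients that are not in `owned`."""
    placeholders = ", ".join("?" for _ in owned)
    sql = f"""
        SELECT id_recipe,
               COUNT(*)  -- total ingredients the recipe needs
               - SUM(CASE WHEN id_ingredient IN ({placeholders}) THEN 1 ELSE 0 END)  -- ones you have
        FROM ingredients_recipe
        GROUP BY id_recipe
    """
    return dict(conn.execute(sql, list(owned)).fetchall())

missing = missing_counts(conn, [2, 5])
can_make_now = sorted(r for r, m in missing.items() if m == 0)
missing_1_or_2 = sorted(r for r, m in missing.items() if 1 <= m <= 2)
missing_3_or_4 = sorted(r for r, m in missing.items() if 3 <= m <= 4)
```

The same grouping could also be done with three separate selects using HAVING on the difference, if you prefer to keep the bucketing in SQL.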
my MySQL table is in this structure:
|id|title|duration|thumb|videoid|tags|category|views
|1||Video Name|300|thumb1.jpg|134|tag1|tag2|tag3|category|15
|2||Video Name2|300|thumb2.jpg|1135|tag2|tag3|tag4|category|10
Table contains about 317k rows.
Query is:
SELECT id,title,thumb FROM videos WHERE tags LIKE '%$keyword%' or title LIKE '%$keyword%' order by id desc limit 20
And this is taking 0.8s to 3s to load results.
I'm new to PHP/MySQL. How can I speed up these queries? Suggestions please, thank you.
The only other suggestion I can throw in is to have a multi-part index of
( tags, title, id )
This way, it can utilize the index to qualify the WHERE clause criteria for both tags and title, and have the ID for the order by clause without having to go back to the raw data pages. Then, when records ARE found, only for those entries does it need to actually retrieve the raw data pages for the other columns associated with the row.
You are using this search construct:
column LIKE '%$keyword%'
The leading % wildcard character definitely defeats the use of indexes for these searches. How to cure this terrible performance problem? You could use FULLTEXT search, which is covered in the MySQL documentation. Or, you could try to organize your tables so
column LIKE 'keyword%'
will find what you need, and then index the columns being searched. To do this, you would create a tag table, with a name and id for each distinct tag. This table will have a primary key on the id, and a unique key on the tag. E.g.
tag_id | tag
1 | drama
2 | comedy
3 | horror
4 | historical
Then you would create another table, known in the trade as a join table, with two ids in it. The primary key of this table is a composite of the two columns. You also need a non-unique index on the tag_id field.
video_id | tag_id
1 | 1
1 | 4
This sample data gives video with id = 1 the tags "drama" and "historical."
Then to match tags you need
SELECT v.id, v.title, v.thumb
FROM videos AS v
JOIN tag_video AS tv ON v.id = tv.video_id
JOIN tag AS t ON tv.tag_id = t.tag_id
WHERE t.tag IN ('drama', 'comedy')
This will look up your tags very fast, and let you look up multiple ones in a single query if you wish.
It won't help with your requirement for full text search on your titles, however.
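To make the key and index layout concrete, here is a small sketch of the tag and join tables, run against Python's sqlite3 in place of MySQL. The table name tag_video comes from the query above, the videos table is cut down to three columns, and the index name, sample rows and the helper videos_with_any_tag are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE videos (id INTEGER PRIMARY KEY, title TEXT, thumb TEXT);

    -- Primary key on the id, unique key on the tag text.
    CREATE TABLE tag (tag_id INTEGER PRIMARY KEY, tag TEXT NOT NULL UNIQUE);

    -- Join table: composite primary key plus a non-unique index on tag_id.
    CREATE TABLE tag_video (
        video_id INTEGER NOT NULL REFERENCES videos(id),
        tag_id INTEGER NOT NULL REFERENCES tag(tag_id),
        PRIMARY KEY (video_id, tag_id)
    );
    CREATE INDEX tag_video_tag_id ON tag_video (tag_id);

    INSERT INTO videos VALUES (1, 'Video Name', 'thumb1.jpg'), (2, 'Video Name2', 'thumb2.jpg');
    INSERT INTO tag VALUES (1, 'drama'), (2, 'comedy'), (3, 'horror'), (4, 'historical');
    INSERT INTO tag_video VALUES (1, 1), (1, 4), (2, 3);
""")

def videos_with_any_tag(conn, tags, limit=20):
    """Return (id, title, thumb) for videos carrying at least one of `tags`, newest id first."""
    placeholders = ", ".join("?" for _ in tags)
    sql = f"""
        SELECT DISTINCT v.id, v.title, v.thumb
        FROM videos AS v
        JOIN tag_video AS tv ON v.id = tv.video_id
        JOIN tag AS t ON tv.tag_id = t.tag_id
        WHERE t.tag IN ({placeholders})
        ORDER BY v.id DESC
        LIMIT ?
    """
    return conn.execute(sql, [*tags, limit]).fetchall()

found = videos_with_any_tag(conn, ["drama", "comedy"])
```

DISTINCT is there because a video matching two of the requested tags would otherwise come back twice.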
EDITED:
Define indexes on the title and tags fields.
Try this:
ALTER TABLE `videos` ADD INDEX (`title`);
ALTER TABLE `videos` ADD INDEX (`tags`);
I have two tables in the database, parts, and products.
I have a column in the products table with strings of ids (comma separated). Those ids match ids of the parts table.
**parts**
ID | description (I'm searching this part)
-------------------------------
1 | some text here
2 | some different text here
3 | ect...
**products**
ID | parts-list
--------------------------------
1 | 1,2,3
2 | 2,3
3 | 1,2
I'm really struggling with the SQL query on this one.
I've done the 1st part, got the id's from the parts table
SELECT * FROM parts WHERE description LIKE '%{$search}%'
The biggest problem is the comma separated structure of the parts-list column.
Obviously, I could do it in PHP, create an array of the results from the parts table, use that to search the products table for id's, and then use those results to grab the row data from the parts table (again). Not very efficient.
I also tried this, but I'm obviously trying to compare two arrays here, not sure how this should be done.
SELECT * FROM `products` WHERE
CONCAT(',', description, ',')
IN (SELECT `id` FROM `parts` WHERE `description` LIKE '%{$search}%')
Can anybody help?
I would perhaps try a combination of LOCATE() and SUBSTR(). I work mainly in MSSQL which has CHARINDEX() that I think works like MySQL's LOCATE(). It is bound to be messy. Are there a variable number of elements in the parts-list field?
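For what it's worth, here is a rough sketch of the string-position idea, written against Python's sqlite3, where instr() plays the role of LOCATE()/CHARINDEX(). Wrapping both the list and the id in commas keeps id 1 from matching inside 11 or 21. The column is named parts_list here because a hyphenated name would need quoting, and the part descriptions and the helper name are invented. In MySQL, FIND_IN_SET() is built for exactly this kind of comma-separated lookup, though none of these string approaches can use an index.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE parts (id INTEGER PRIMARY KEY, description TEXT);
    CREATE TABLE products (id INTEGER PRIMARY KEY, parts_list TEXT);
    -- Invented descriptions; the parts_list values follow the question.
    INSERT INTO parts VALUES (1, 'steel bolt'), (2, 'rubber washer'), (3, 'steel nut');
    INSERT INTO products VALUES (1, '1,2,3'), (2, '2,3'), (3, '1,2');
""")

def products_using_parts_like(conn, search):
    """Return products whose parts_list contains a part whose description matches `search`."""
    sql = """
        SELECT DISTINCT pr.id, pr.parts_list
        FROM products AS pr
        JOIN parts AS p
          -- ',1,2,3,' contains ',2,' : comma-wrapping avoids partial id matches
          ON instr(',' || pr.parts_list || ',', ',' || p.id || ',') > 0
        WHERE p.description LIKE ?
        ORDER BY pr.id
    """
    return conn.execute(sql, [f"%{search}%"]).fetchall()

bolt_products = products_using_parts_like(conn, "bolt")
steel_products = products_using_parts_like(conn, "steel")
```

This assumes the list has no spaces after the commas. A separate product/part join table would avoid the string handling altogether.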
I am quite new to PHP and MySQL, but have experience of VBA and C++. In short, I am trying to count the occurrences of a value (text string), which can appear in any of 14 columns in my table.
I think I will need to populate a single-dimensional array from this table, but the table has 14 columns (named 'player1' to 'player14'). I want each of these 'players' to be entered into the one-dimensional array (if not NULL), before proceeding to the next row.
I know there is the SELECT DISTINCT statement in MySQL, but can I use this to count distinct occurrences across 14 columns?
For background, I am building a football results database, where player1 to player14 are the starting 11 (and 3 subs), and my PHP code will count the number of times a player has made an appearance.
Thanks for all your help!
Matt.
Rethink your database schema. Try this:
Table players:
player_id
name
Table games:
game_id
Table appearances:
appearance_id
player_id
game_id
This reduces the amount of duplicate data. Read up on normalization. It allows you to do a simple select count(*) from appearances inner join players using (player_id) where name='Joe Schmoe'
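A quick sketch of that schema and the count query, using Python's sqlite3 so it can be run as-is. The sample players and games are made up, the UNIQUE constraint on (player_id, game_id) is my own assumption (one appearance per player per game), and the two helper functions are just illustrative names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE players (player_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE games (game_id INTEGER PRIMARY KEY);
    CREATE TABLE appearances (
        appearance_id INTEGER PRIMARY KEY,
        player_id INTEGER NOT NULL REFERENCES players(player_id),
        game_id INTEGER NOT NULL REFERENCES games(game_id),
        UNIQUE (player_id, game_id)  -- assumption: a player appears at most once per game
    );
    INSERT INTO players VALUES (1, 'Joe Schmoe'), (2, 'Jane Doe');
    INSERT INTO games VALUES (1), (2), (3);
    INSERT INTO appearances (player_id, game_id) VALUES (1, 1), (1, 2), (1, 3), (2, 2);
""")

def appearance_count(conn, name):
    """Number of games the named player appeared in."""
    sql = """
        SELECT COUNT(*)
        FROM appearances
        INNER JOIN players ON players.player_id = appearances.player_id
        WHERE players.name = ?
    """
    return conn.execute(sql, [name]).fetchone()[0]

def all_appearance_counts(conn):
    """(name, appearances) for every player, including players with none."""
    sql = """
        SELECT players.name, COUNT(appearances.appearance_id)
        FROM players
        LEFT JOIN appearances ON appearances.player_id = players.player_id
        GROUP BY players.player_id, players.name
        ORDER BY players.name
    """
    return conn.execute(sql).fetchall()

joe = appearance_count(conn, "Joe Schmoe")
everyone = all_appearance_counts(conn)
```

With this layout there is no need to build an array in PHP first; the grouping query returns one row per player.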
First of all, the database schema you're using is terrible, and you just found out a reason why.
That being said, I see no other way than to first get a list of all players by selecting the distinct player names into an array. Before each insertion, you would have to check if the name is already in the array (if it is already in, don't add it again).
Then, when you have the list of names, you would have to run an SQL statement for each player, adding up the number of occurrences, like so:
SELECT COUNT(*)
FROM <Table>
WHERE player1=? OR player2=? OR player3=? OR ... OR player14 = ?
That is all pretty complicated, and as I said, you should really change your database schema.
This sounds like a job for fetch_assoc (http://php.net/manual/de/mysqli-result.fetch-assoc.php).
If you use mysqli, you would get each row as an associative array.
On the other hand the table design seems a bit flawed, as suggested before.
Instead, you could have one table team with the team name and whatnot, and one table player with the player names.
TEAM
| id | name | founded | foo |
PLAYER
| id | team_id | name | bar |
With that structure you could add 14 players, which point at the same team and by joining the two tables, extract the players that match your search.
This is the first time I am writing an actual search feature for my database.
The database consists of hotel names, hotel food items, hotel locations.
I would like the above three to show up during a search of a string.
Are there any common search algorithm or packages that can be used ?
EXPECTED RESULT SET:
id | name | description | table_name | rank
56 | KFC| Fried chicken | hotel | 1
12 | [food item name] | [food item description] | food_item | 2
19 | [hotel name] | [hotel description] | hotel | 3
....
Do you mean a relational database? If yes, your "search" algorithm is a WHERE clause.
Do you mean contextual search? Lucene is a great search engine implementation written in Java. This might help you marry it with Lucene:
http://www.cabotsolutions.com/2009/05/using-solr-lucene-for-full-text-search-with-mysql-db/
The answer is far more complicated if you're thinking about crawling web sites based on some criteria. Please clarify.
If you are using Microsoft SQL Server, FreeText works very well:
http://msdn.microsoft.com/en-us/library/ms176078.aspx
Let's consider you're using mysql.
Well your question is basically: how to write a query that will search hotel name, food items, and hotel location.
I guess these 3 pieces of information are stored in 3 different tables. The easiest way would be to simply query the 3 tables one after the other with queries like these:
SELECT * FROM hotel WHERE hotel_name LIKE "%foobar%";
SELECT * FROM hotel_food_item WHERE item_name LIKE "%foobar%";
SELECT * FROM hotel_location WHERE hotel_name LIKE "%foobar%" OR street_name LIKE "%foobar%" OR city LIKE "%foobar%";
Make sure your search term are safe from SQL injection
You may (or not) want to group the query into 1 bigger query
If your database is becoming large (like more than 100 000 rows per table), or if you run a lot of search queries, you might be interested in creating a search index, or in using a dedicated database intended for text search, like Elasticsearch or something else.
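On the SQL injection point, the usual approach is to bind the search term as a parameter and never paste it into the query string. Here is a hedged sketch using Python's sqlite3. The same idea applies to prepared statements in PHP. It also merges two of the queries with UNION ALL. The sample rows, the cut-down table definitions and the helper name search_all are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE hotel (id INTEGER PRIMARY KEY, hotel_name TEXT);
    CREATE TABLE hotel_food_item (id INTEGER PRIMARY KEY, item_name TEXT);
    INSERT INTO hotel VALUES (56, 'KFC'), (19, 'Grand Hotel');
    INSERT INTO hotel_food_item VALUES (12, 'KFC style fried chicken');
""")

def search_all(conn, term):
    """Search hotels and food items for `term`, returning (id, name, table_name) rows."""
    # Escape LIKE wildcards so a user typing % or _ searches for those characters literally.
    escaped = term.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    pattern = f"%{escaped}%"
    sql = """
        SELECT id, hotel_name AS name, 'hotel' AS table_name
        FROM hotel WHERE hotel_name LIKE ? ESCAPE '\\'
        UNION ALL
        SELECT id, item_name, 'food_item'
        FROM hotel_food_item WHERE item_name LIKE ? ESCAPE '\\'
        ORDER BY table_name, id
    """
    # The term only ever travels as a bound parameter, never as SQL text.
    return conn.execute(sql, [pattern, pattern]).fetchall()

hits = search_all(conn, "KFC")
attack = search_all(conn, "' OR 1=1 --")  # treated as plain text, matches nothing
```

The hotel_location query would be added as a third branch of the UNION ALL in the same way.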
Edit:
If relevance is a matter, use MATCH AGAINST:
http://maisonbisson.com/blog/post/10752/making-mysql-do-relevance-ranked-full-text-searches/
http://www.devshed.com/c/a/PHP/Using-Relevance-Rankings-for-Full-Text-and-Boolean-Searches-with-MySQL/
PHP MySQL Search And Order By Relevancy
You'll have to create 3 subqueries that do MATCH AGAINST, and then combine them. You can do AGAINST("foobar") as rank so you'll have the score you need.
This should look like:
SELECT *
FROM
(
SELECT id, 'hotel' as table_name, MATCH (search_field1) AGAINST ("lorem") as rank FROM tableA
UNION
SELECT id, 'food' as table_name, MATCH (search_field2) AGAINST ("lorem") as rank FROM tableB
) as res
ORDER BY res.rank DESC
If you are using MyISAM tables (InnoDB also supports FULLTEXT indexes from MySQL 5.6 onward), you can use MySQL's built-in full-text search.
This works by first putting a full-text index on the columns you wish to search, and then creating a query that looks roughly like this:
SELECT *, MATCH(column_to_search) AGAINST('$search_string') AS relevance
FROM your_table
WHERE MATCH(column_to_search) AGAINST('$search_string' IN BOOLEAN MODE)
ORDER BY relevance DESC
LIMIT 20