I have a table in a database with a structure like this
Keywords
id int(11)
U_id int(11)
keywords text
create_date int(11)
U_id is a foreign key, id is the primary key
The keywords field is a list of words created by users separated by commas
I was wondering if someone could suggest an efficient query to search such a table.
You should change your database design so that you have a table called user_keyword and store each keyword in a separate row. You can then index this table and search it easily and efficiently:
WHERE keyword = 'foo'
If you can't modify the database then you can use FIND_IN_SET but it won't be very efficient:
WHERE FIND_IN_SET('foo', keywords)
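As a rough illustration of the one-keyword-per-row layout, here is a minimal Python sketch that uses the standard-library sqlite3 module as a stand-in for MySQL. The user_keyword table name comes from the answer above; the column names, the index name and the sample rows are assumptions made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user_keyword (
        u_id    INTEGER NOT NULL,
        keyword TEXT    NOT NULL,
        PRIMARY KEY (u_id, keyword)
    );
    -- Index that serves lookups of the form WHERE keyword = 'foo'.
    CREATE INDEX idx_user_keyword_keyword ON user_keyword (keyword);
""")

# Hypothetical rows in the old comma-separated layout: (U_id, keywords).
old_rows = [(1, "foo,bar"), (2, "bar,baz"), (3, "foo")]

# Split each comma-separated list into one row per keyword.
for u_id, csv_keywords in old_rows:
    for keyword in csv_keywords.split(","):
        conn.execute(
            "INSERT INTO user_keyword (u_id, keyword) VALUES (?, ?)",
            (u_id, keyword.strip()),
        )


def users_with_keyword(keyword):
    """Return the ids of all users who stored exactly this keyword."""
    rows = conn.execute(
        "SELECT u_id FROM user_keyword WHERE keyword = ? ORDER BY u_id",
        (keyword,),
    )
    return [u_id for (u_id,) in rows]


print(users_with_keyword("foo"))
```

The same split-and-insert loop could serve as a one-off migration from the old keywords column, assuming the stored keywords contain no embedded commas.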
Separate the keywords into their own table, "connect" it to the old table via a FOREIGN KEY, index it, and you'll be able to search for exact keywords or keyword prefixes efficiently.
For example:
id U_id keywords create_date
1 - A,B,C -
Becomes:
PARENT_TABLE:
id U_id create_date
1 - -
CHILD_TABLE:
id keyword
1 A
1 B
1 C
Provided there is an index on keyword, the following query should be efficient:
SELECT * FROM PARENT_TABLE
WHERE id IN (SELECT id FROM CHILD_TABLE WHERE keyword = ...)
---EDIT---
Based on Johan's comments below, it appears that InnoDB uses what is known as "index-organized tables" under Oracle or "clusters" under most other databases. Provided you don't need to query "from parent to child" (i.e. "give me all keywords for given id"), the PRIMARY KEY on CHILD_TABLE should be:
{keyword, id}
Since the keyword is the first field in the composite index, WHERE keyword = ... (or WHERE keyword LIKE 'prefix%') can use this index directly.
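To make the {keyword, id} key concrete, here is a small Python sketch with the standard-library sqlite3 module. SQLite is only a stand-in here: its WITHOUT ROWID tables store rows in primary-key order, which is roughly comparable to the InnoDB behaviour described above. The table and column names follow the example; the sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE child_table (
        keyword TEXT    NOT NULL,
        id      INTEGER NOT NULL,
        PRIMARY KEY (keyword, id)   -- keyword first, so it leads the index
    ) WITHOUT ROWID
""")
conn.executemany(
    "INSERT INTO child_table (keyword, id) VALUES (?, ?)",
    [("apple", 1), ("apricot", 1), ("banana", 2), ("apple", 3)],
)


def ids_for_keyword(keyword):
    """Exact match: WHERE keyword = ..."""
    rows = conn.execute(
        "SELECT id FROM child_table WHERE keyword = ? ORDER BY id",
        (keyword,),
    )
    return [row[0] for row in rows]


def ids_for_prefix(prefix):
    """Prefix match: WHERE keyword LIKE 'prefix%'.

    Simplification: assumes the prefix itself contains no % or _ characters.
    """
    rows = conn.execute(
        "SELECT DISTINCT id FROM child_table WHERE keyword LIKE ? ORDER BY id",
        (prefix + "%",),
    )
    return [row[0] for row in rows]


print(ids_for_keyword("apple"))
print(ids_for_prefix("ap"))
```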
If you are using MyISAM (or InnoDB on MySQL 5.6 or later), you can create a fulltext index on the keywords field. Then search using:
select * from keywords k where match(k.keywords) against('test');
Of course CSV in a database is just about the worst thing you can do. You should put keywords in a separate table. Make sure to use InnoDB for all tables.
Table tags
-------------
id integer auto_increment primary key
keyword_id integer foreign key references keywords(id)
keyword varchar(40)
Now you can select using:
SELECT k.* FROM keywords k
INNER JOIN tags t ON (t.keyword_id = k.id)
WHERE t.keyword LIKE 'test'; -- case-insensitive comparison (with the default collation)
Much much faster than CSV.
You have 2 options:
1) Restructure your database: create an extra table called Keywords that includes a U_id column, which will be a foreign key mapped to your user table. That way you can easily insert each keyword into the Keywords table and then search it using something like:
SELECT * FROM Keywords WHERE keyword LIKE '%KEYWORD%'
2) Fetch the keywords field, separate the keywords and put them into an array using your preferred language, and then search the array.
I am coding a tiny search engine for practice and want to add search functionality to it. I am trying to select all rows of the questions table that match on title, description and keywords.
I created the following 3 tables:
questions(id(PK), title, description)
keywords(id(PK), label);
questions_keywords(id(PK), question_id(FK), keyword_id(FK));
So far my SQL query looks like this:
SELECT q.* FROM question_keywords qk
JOIN keywords k ON qk.keyword_id=k.id
JOIN questions q ON qk.question_id=q.id
WHERE q.description LIKE '%javascript%'
OR
k.keyword_label LIKE '%java%'
In this query, I am selecting all the rows from the questions table containing the substring java or javascript.
Am I doing it right, or is there a better way to do it?
Thanks in advance.
As others mentioned, I would add DISTINCT. I would also reorder the tables. Functionally I don't think it matters, it just bugged me... ha ha
SELECT DISTINCT
q.*
FROM
questions AS q
JOIN
question_keywords AS qk ON q.id = qk.question_id
JOIN
keywords AS k ON qk.keyword_id = k.id
WHERE
q.description LIKE '%javascript%'
OR
k.label LIKE '%java%';
As you can see in this DBfiddle
https://www.db-fiddle.com/f/pcVqcMm1yUoU6NdSHitCVr/2
The reason you get duplicates is the join itself: each question row is repeated once for every matching row in the junction table. This is loosely described as a Cartesian product
https://en.wikipedia.org/wiki/Cartesian_product
In simple terms, it is just a consequence of having a "Many to Many" relationship.
If you look at the fiddle, I intentionally created this situation with the last 2 inserts into the bridge (or junction) table question_keywords:
INSERT INTO question_keywords (question_id,keyword_id)VALUES(4,1);
INSERT INTO question_keywords (question_id,keyword_id)VALUES(4,2);
The duplicate row appears simply because there are 2 entries in this table with the matching value of 4 for question_id. So these are only duplicates in the sense that we are only selecting the fields from the questions table. If we included fields from the keywords table, then one row would have Java (#1) as the keyword while the other would have Javascript (#2).
Hope that helps explain it.
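The effect can be reproduced in a few lines. The sketch below uses Python's standard-library sqlite3 module rather than MySQL, with the table names from the fiddle; the question text and the keyword labels are made-up sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE questions (id INTEGER PRIMARY KEY, title TEXT, description TEXT);
    CREATE TABLE keywords (id INTEGER PRIMARY KEY, label TEXT);
    CREATE TABLE question_keywords (
        question_id INTEGER NOT NULL,
        keyword_id  INTEGER NOT NULL,
        PRIMARY KEY (question_id, keyword_id)
    );
    INSERT INTO questions VALUES (4, 'Closures', 'How do closures work?');
    INSERT INTO keywords VALUES (1, 'Java'), (2, 'Javascript');
    -- Question 4 is linked to both keywords, and both labels contain "java".
    INSERT INTO question_keywords VALUES (4, 1), (4, 2);
""")

QUERY = """
    SELECT {modifier} q.id
    FROM questions AS q
    JOIN question_keywords AS qk ON q.id = qk.question_id
    JOIN keywords AS k ON qk.keyword_id = k.id
    WHERE q.description LIKE '%javascript%'
       OR k.label LIKE '%java%'
"""

# Without DISTINCT the question comes back once per matching keyword row.
plain = [row[0] for row in conn.execute(QUERY.format(modifier=""))]
# With DISTINCT the repeated question collapses to a single row.
distinct = [row[0] for row in conn.execute(QUERY.format(modifier="DISTINCT"))]

print(plain)     # [4, 4]
print(distinct)  # [4]
```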
A few other things to note:
The query you posted has an error: k.keyword_label LIKE '%java%' should be k.label LIKE '%java%' according to your table definition in the question.
Typically the name of the junction table should be a combination of the names of both tables it joins (which you almost did), but the pluralization is inconsistent: question_keywords should be questions_keywords. It's a small thing, but it could cause confusion when writing queries.
There is really no need for a separate primary key on the junction table.
Notice how I created the table in the fiddle:
CREATE TABLE question_keywords(
question_id INT(10) UNSIGNED NOT NULL,
keyword_id INT(10) UNSIGNED NOT NULL,
PRIMARY KEY(question_id,keyword_id)
);
The primary key is a compound of the 2 foreign keys. This has the added benefit of preventing real duplicate rows from being made. For example, if you tried this:
INSERT INTO question_keywords (question_id,keyword_id)VALUES(4,1);
INSERT INTO question_keywords (question_id,keyword_id)VALUES(4,1);
With the setup I have, it would be impossible to create the duplicate. You can still have a separate primary key (surrogate key), but then you should create a compound unique index on those 2 keys to take the place of the compound primary key.
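Here is a small sketch of that behaviour, using Python's standard-library sqlite3 module in place of MySQL. The link helper is a hypothetical name introduced only for this example; the table definition mirrors the one from the fiddle.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE question_keywords (
        question_id INTEGER NOT NULL,
        keyword_id  INTEGER NOT NULL,
        PRIMARY KEY (question_id, keyword_id)
    )
""")


def link(question_id, keyword_id):
    """Try to link a question to a keyword; return False if the pair exists."""
    try:
        conn.execute(
            "INSERT INTO question_keywords (question_id, keyword_id) VALUES (?, ?)",
            (question_id, keyword_id),
        )
        return True
    except sqlite3.IntegrityError:
        # The compound primary key rejected the duplicate pair.
        return False


first = link(4, 1)
second = link(4, 1)  # same pair again
row_count = conn.execute("SELECT COUNT(*) FROM question_keywords").fetchone()[0]

print(first, second, row_count)  # True False 1
```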
Hi, I want a select query in MySQL. I have a table named delivery_area with a column named start_with_zip_postal. The column holds comma-separated values like 100,101. I have a postal code, 101245, and I want to find the rows where start_with_zip_postal contains a value that 101245 starts with.
Here one row should be retrieved, because the column contains the value 101.
For better understanding, please check my table structure in the screenshot.
You can use find_in_set, as in the query below:
select * from delivery_areas where find_in_set(substring('101245',1,3),start_with_zip_postal);
You can use FIND_IN_SET instead of LIKE.
SELECT * FROM delivery_areas WHERE FIND_IN_SET(LEFT('101245',3), start_with_zip_postal);
To search in comma-separated values in MySQL, we can use FIND_IN_SET.
You need to normalize your data so that each prefix is in a different row. Create a table delivery_area_zip_prefix
CREATE TABLE delivery_area_zip_prefix (
id INT AUTO_INCREMENT PRIMARY KEY,
delivery_area_id INT,
starts_with_zip_postal VARCHAR(16),
CONSTRAINT FOREIGN KEY (delivery_area_id) REFERENCES delivery_area (id)
);
Then you can do:
SELECT d.*
FROM delivery_area AS d
JOIN delivery_area_zip_prefix AS p ON d.id = p.delivery_area_id
WHERE LOCATE(p.starts_with_zip_postal, '101245') = 1
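The prefix test on the normalized table can be tried out with the sketch below, which uses Python's standard-library sqlite3 module instead of MySQL. SQLite's instr(zip, prefix) plays the role that LOCATE(prefix, zip) has in MySQL: both return 1 when the zip code starts with the prefix. The name column and the sample rows are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE delivery_area (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE delivery_area_zip_prefix (
        id                     INTEGER PRIMARY KEY,
        delivery_area_id       INTEGER REFERENCES delivery_area (id),
        starts_with_zip_postal TEXT
    );
    INSERT INTO delivery_area VALUES (1, 'North'), (2, 'South');
    -- The old value '100,101' for area 1 becomes two rows.
    INSERT INTO delivery_area_zip_prefix (delivery_area_id, starts_with_zip_postal)
    VALUES (1, '100'), (1, '101'), (2, '2045');
""")


def areas_for_zip(zip_code):
    """Return ids of delivery areas that have a prefix the zip code starts with."""
    rows = conn.execute(
        """
        SELECT DISTINCT d.id
        FROM delivery_area AS d
        JOIN delivery_area_zip_prefix AS p ON d.id = p.delivery_area_id
        WHERE instr(?, p.starts_with_zip_postal) = 1
        ORDER BY d.id
        """,
        (zip_code,),
    )
    return [row[0] for row in rows]


print(areas_for_zip("101245"))
```

Unlike the FIND_IN_SET approach with a fixed three-character substring, this works for prefixes of any length.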
I'm in the midst of constructing some database tables, but a possible search issue has just come to mind.
The two tables in question are Genres, a 2-column table holding a list of music genres identified by an ID field, i.e. 1 = Dance, 2 = Rock, and so on, and Music, a multi-column table with Title, Artist, and Genre_ID fields. And yes, you've guessed it, Genre_ID refers to the ID of the Genres table.
My question is: if I have a search box on the site powered by PHP, and that search box queries the key fields (Title, Artist, and Genre) to yield the best result, how can I get that to work correctly when the genre name itself is in a separate table and not in the Music table?
An example search would be, "rock music by ACDC".
To connect multiple tables in a query, you should look at using "join" statements. Rather than reinventing the wheel, the first answer to this post does a good job of explaining them... When to use a left outer join
Create a view where you join both of the tables. Then use SELECT with LIKE in WHERE clause or better use a fulltext search to do the searching job.
The view
create view ViewMusicWithGenre as
select "*"
from Music as m
left join Genre as g on m.genre_id = g.id;
Search option with like
select "*"
from ViewMusicWithGenre
where Title like '%<what_you_search>%'
or Artist like '%<what_you_search>%'
or Genre like '%<what_you_search>%';
I wrote the asterisk in "" because I KNOW that you WILL NOT use an asterisk.
Left join is there because you want the row even without specified genre (very likely).
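For a quick way to try the view-plus-LIKE approach, here is a sketch using Python's standard-library sqlite3 module in place of MySQL. The view name comes from the answer above; the Name column of Genres, the explicit column list of the view and the sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Genres (ID INTEGER PRIMARY KEY, Name TEXT NOT NULL);
    CREATE TABLE Music (
        ID       INTEGER PRIMARY KEY,
        Title    TEXT NOT NULL,
        Artist   TEXT NOT NULL,
        Genre_ID INTEGER REFERENCES Genres (ID)
    );
    INSERT INTO Genres VALUES (1, 'Dance'), (2, 'Rock');
    INSERT INTO Music VALUES
        (1, 'Back in Black', 'ACDC', 2),
        (2, 'One More Time', 'Daft Punk', 1),
        (3, 'Untitled Demo', 'Unknown', NULL);   -- no genre assigned

    -- LEFT JOIN keeps tracks that have no genre.
    CREATE VIEW ViewMusicWithGenre AS
    SELECT m.ID AS Music_ID, m.Title, m.Artist, g.Name AS Genre
    FROM Music AS m
    LEFT JOIN Genres AS g ON m.Genre_ID = g.ID;
""")


def search(term):
    """Return titles whose title, artist or genre contains the search term."""
    pattern = f"%{term}%"
    rows = conn.execute(
        """
        SELECT Title FROM ViewMusicWithGenre
        WHERE Title LIKE ? OR Artist LIKE ? OR Genre LIKE ?
        ORDER BY Music_ID
        """,
        (pattern, pattern, pattern),
    )
    return [row[0] for row in rows]


print(search("rock"))
```

A phrase such as "rock music by ACDC" would still need to be split into individual terms before searching; that step is left out here.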
The fulltext search
This usually depends on the database you use. This is for instance Microsoft SQL Server 2014:
Fulltext search - http://technet.microsoft.com/en-us/library/ms142571.aspx
Fulltext index - http://technet.microsoft.com/en-us/library/ms187317.aspx
Querying fulltext search - http://technet.microsoft.com/en-us/library/ms142583.aspx
EDIT: for MySQL database
MySQL does not support fulltext indexes on views. So you are left with a couple of choices:
1) use the LIKE operator - could be inefficient, also more work later on
2) create the fulltext index on the Music table and omit the genre - not good enough
3) create a new table that resembles the join, fill it on, say, a daily basis with a job (or something like that), and do the fulltext search on that table - the best solution in the long term, but more work to begin with, and it duplicates data
You also have to bear in mind that before MySQL 5.6 fulltext indexes only work on the MyISAM storage engine; InnoDB supports them from 5.6 onward.
The create statement for the joint table
create table fulltextSearchTable (
Music_ID int not null primary key,
Music_Title varchar(1024) not null,
Music_Artist varchar(1024) not null,
Genre_ID int not null,
Genre_Title varchar(1024) not null,
fulltext(Music_Title, Music_Artist, Genre_Title)
) engine=MyISAM;
The select with fulltext search
select "*"
from fulltextSearchTable
where match(Music_Title, Music_Artist, Genre_Title) against ('your_keyword');
You can try INNER JOIN like this:
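// Wrap the search word in wildcards and bind it as a parameter to avoid SQL injection.
$like = "%" . $searchword . "%";
$stmt = mysqli_prepare($YourConnection, "SELECT music.title, music.artist FROM music
INNER JOIN genres ON music.genre_id=genres.genre_id
WHERE music.title LIKE ?
OR music.artist LIKE ?
OR genres.genre LIKE ?");
mysqli_stmt_bind_param($stmt, "sss", $like, $like, $like);
mysqli_stmt_execute($stmt);
$result = mysqli_stmt_get_result($stmt);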
And then print the results like this:
while($row=mysqli_fetch_array($result)){
echo $row['title']." - ".$row['artist']."<br>";
}
I have this database table:
Column Type
source text
news_id int(12)
heading text
body text
source_url tinytext
time timestamp
news_pic char(100)
location char(128)
tags text
time_created timestamp
hits int(10)
Now I am looking for an algorithm or tool to search this table, which contains news data, for a keyword. The keyword should be searched for in heading, body and tags, and the number of hits on the news item should also be taken into account to give the best results.
MySQL already has the tool you need built-in: full-text search. I'm going to assume you know how to interact with MySQL using PHP. If not, look into that first. Anyway ...
1) Add a full-text index covering the fields you want to search:
alter table TABLE_NAME add fulltext(heading, body, tags);
2) Use a match ... against statement to perform a full-text search. The column list in match() must be the same as the column list of the full-text index:
select * from TABLE_NAME where match(heading, body, tags) against ('SEARCH_STRING');
Obviously, substitute your table's name for TABLE_NAME and your search string for SEARCH_STRING in these examples.
I don't see why you'd want to search the number of hits, as it's just an integer (and it cannot be part of a full-text index anyway). You could sort by number of hits, however, by adding an order by clause to your query:
select * from TABLE_NAME where match(heading, body, tags) against ('SEARCH_STRING') order by hits desc;
I've got a database of games with a genre field that has unique ids in it that are separated by commas. It's a text field. For example: HALO 2 - cat_genre => '1,2' (Action,Sci-Fi)
I'm trying to make a function that calculates the total number of games in that genre. So it's matching 1 value to multiple values separated by commas in the db.
I was using SELECT * FROM gh_game WHERE cat_genre IN (1) which would find Action Games.
I'm trying to work this out, I thought I had nailed it once before but I just can't figure it out.
You need to create a many-to-many relation, like so:
CREATE TABLE gameGenreTable ( id int NOT NULL PRIMARY KEY, genreID int NOT NULL, gameID int NOT NULL)
EDIT: if you're using InnoDB you can also create foreign keys on genreID and gameID..
I would add a UNIQUE key on genreID, gameID
then you can do a query like this
SELECT genreID,count(genreID) as count from gameGenreTable GROUP BY genreID;
-- and join in your other table to get the genre name (or just use the ID).
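The whole idea, from splitting the old cat_genre values to counting games per genre, can be sketched as follows. The example uses Python's standard-library sqlite3 module rather than MySQL, and the sample games are made up; only the table and column names come from the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE gameGenreTable (
        id      INTEGER PRIMARY KEY,
        genreID INTEGER NOT NULL,
        gameID  INTEGER NOT NULL,
        UNIQUE (genreID, gameID)
    )
""")

# Hypothetical rows in the old layout: (gameID, cat_genre).
old_games = [(1, "1,2"), (2, "1"), (3, "2,3")]

# One junction row per (genre, game) pair.
for game_id, cat_genre in old_games:
    for genre_id in cat_genre.split(","):
        conn.execute(
            "INSERT INTO gameGenreTable (genreID, gameID) VALUES (?, ?)",
            (int(genre_id), game_id),
        )

# Number of games in each genre, keyed by genre id.
counts = dict(
    conn.execute(
        "SELECT genreID, COUNT(*) FROM gameGenreTable GROUP BY genreID"
    )
)

print(counts)  # {1: 2, 2: 2, 3: 1}
```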