I am using PHP and MySQL.
I have an application in which the user enters arbitrary text, and I want to find related data in the database without using the LIKE clause in my MySQL query.
Is there any way to search for these strings in the database,
or any other approach in MySQL to do this?
Thanks in advance.
You can also check out the MATCH ... AGAINST clause (MySQL full-text search).
You can use REGEXP. When the user enters a single word, put WHERE field REGEXP 'TEXT' in your query (REGEXP matches anywhere in the value, so wrapping the pattern in .* is unnecessary). Regex is cool because you can let the user type a regular expression into the search field.
If you don't want to use LIKE, and don't give a reason why (it seems fine for everyone else), then here is a solution that gets you around it. (But it might not be the best real-world option...)
Whenever anything is added to the database that you want to be searched, take each word and break it into every possible combination of 1 or more consecutive letters.
E.g. for stack:
s, t, a, c, k, st, ta, ac, ck, sta, tac, ack, stac, tack, stack
Insert each of these into a table with an identifier that links to the original data.
Then you can match any search query against this list of words exactly (for full and partial matches). If your user is searching for multiple keywords, you split them in the front end and search for each, looking for matches to the same identifier.
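The steps above can be sketched in a few lines of Python. This is only an illustration of the idea, not a production design: the in-memory dict stands in for the lookup table, and the names substrings, build_index and search are made up for this example.

```python
def substrings(word):
    """Return every run of 1 or more consecutive letters, shortest first."""
    pieces = []
    for length in range(1, len(word) + 1):
        for start in range(len(word) - length + 1):
            piece = word[start:start + length]
            if piece not in pieces:  # skip duplicates such as the two "l" in "hello"
                pieces.append(piece)
    return pieces


def build_index(records):
    """Map each substring to the ids of the records that contain it.

    `records` is a hypothetical {id: text} dict standing in for the real table.
    """
    index = {}
    for record_id, text in records.items():
        for word in text.lower().split():
            for piece in substrings(word):
                index.setdefault(piece, set()).add(record_id)
    return index


def search(index, query):
    """Exact lookup of every keyword; a record must match all of them."""
    result = None
    for keyword in query.lower().split():
        ids = set(index.get(keyword, set()))
        result = ids if result is None else result & ids
    return result if result is not None else set()


records = {1: "stack overflow", 2: "hay stack", 3: "over there"}
index = build_index(records)
print(substrings("stack"))
print(search(index, "tac"))       # partial match, no LIKE involved
print(search(index, "tac over"))  # both keywords must hit the same id
```

Note how quickly the index grows: a word of n letters produces up to n(n+1)/2 entries, which is why this may not be the best real-world option.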
I'm creating a paraphrasing system, where a user inputs text and the system paraphrases for them.
My database looks like this:
KeyWord: dainty
Synonyms1: choice; delicious; tasty; juicy; luscious; palatable; savoury
Synonyms2: ethereal; beautiful; fragile; charming; petite; frail; elegant
where KeyWord (varchar), Synonyms1 (text), and Synonyms2 (text) are database columns. The example above is one row of a table with 3 fields and their values.
This is how it works: if the system finds, for example, a word like tasty, it can be replaced by any of the words separated by semicolons in either Synonyms1 or Synonyms2, or by the keyword, because they are all synonyms.
Let me explain how the word search works. The system first searches for the word in the KeyWord column; if the word is not found, it goes on to search the Synonyms1 column, and so on.
My problem is checking for the user's specific word in the Synonyms1 or Synonyms2 columns. When I use the LIKE clause, the generic way of searching the database, the system does not search for a whole word; instead, it matches any run of characters. For example, assume the writer's text is "Benson has an ice cube": the system considers ice to have been found inside choice. I don't want that; I want to match whole words only.
If anyone has understood me, please help to solve this.
If I understand your question, you want to search for ice in columns Synonyms1 and Synonyms2 but make sure you do not inadvertently match a word such as choice.
If you have ever read or heard anything on the subject of database normalization, you would realize that your database does not even meet the requirements for 1NF (first normal form) because it has columns that hold repeating values, which, as you have found out, makes searching inefficient and difficult. But let's move on:
A synonym column might just contain one word, so it might look like:
ethereal
Or:
ethereal; beautiful; fragile; charming; petite; frail; elegant
Thus the word you are looking for might be:
the entire column value
preceded by nothing and followed by a ;
preceded by a space and followed by a ;
preceded by a space and followed by nothing
So if your version of MySQL does not support regular expressions, then to look for, say, the word ice in column Synonyms2, the WHERE clause should be:
WHERE (
Synonyms2 = 'ice'
OR
Synonyms2 like 'ice;%'
OR
Synonyms2 like '% ice;%'
OR
Synonyms2 like '% ice'
)
If you are running MySQL 8+, then:
WHERE regexp_like(Synonyms2, '( |^)ice(;|$)')
This states that ice must be preceded by either a space or the start of the string and followed by either a ; or the end of the string.
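If you want to check the pattern outside the database first, the same rule can be tried in Python. This is just a sketch of the boundary logic described above, using Python's re module rather than MySQL's regex engine; has_whole_word is a made-up helper name.

```python
import re


def has_whole_word(column_value, word):
    """True if `word` appears as a complete entry in a '; '-separated list.

    Mirrors the SQL pattern: preceded by a space or start of string,
    followed by a semicolon or end of string.
    """
    pattern = r"(?:^| )" + re.escape(word) + r"(?:;|$)"
    return re.search(pattern, column_value) is not None


print(has_whole_word("choice; delicious; tasty", "ice"))  # ice only occurs inside choice
print(has_whole_word("cold; ice; frozen", "ice"))
print(has_whole_word("ice", "ice"))
```

re.escape is there so that a search word containing regex metacharacters is treated literally.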
This may be a newbie question, as I'm not an expert in SQL. However, I couldn't find the answer using Google.
I have a table called record_fields which contains the majority of my system's content, which I want to search in. The content column is defined as LONGTEXT as it can include extremely long input.
Originally, I used (simplifying the query a bit for clarity's sake):
SELECT * FROM record_fields WHERE LOWER(content) LIKE LOWER('%{$keyword}%')
Execution time aside, this query has one major issue. If I search for the term "post" it will return all content which has words like "poster", "posting" and others. I wanted to add a FULLTEXT search.
Now the query looks like this (again, simplified):
SELECT * FROM record_fields WHERE MATCH (content) AGAINST ('{$keyword}')
However, this is still problematic. With MATCH, if my system's users search for the words "Bank of America", for example, all records that have either the word "Bank" or the word "America" will be returned.
TL;DR - my question is this:
How do I use MATCH to search for exact phrases with spaces in them?
Any help would be highly appreciated, thanks in advance!
LIKE '%{keyword}%' matches any text that contains your keyword anywhere in the string. MATCH usually treats each word in the search string as an individual search term and matches against each. You can use boolean mode and put a + symbol before each required keyword. Take a look at the MySQL reference for this.
Edited the answer to reflect Idan's response in not getting the results from the suggested %keyword solution.
You can use MATCH ... AGAINST in boolean mode and wrap your input string in double quotes: '"{$keyword}"'.
Check the last example at the link below:
https://dev.mysql.com/doc/refman/5.5/en/fulltext-boolean.html
SELECT * FROM record_fields WHERE MATCH (content) AGAINST ('"{$keyword}"' IN BOOLEAN MODE )
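One thing to watch with the quoted form is a double quote inside the user's input, which would end the phrase early. Here is a small sketch of building the phrase argument separately from the SQL text. It is written in Python purely for illustration; the helper name phrase_search_query is made up, and the %s placeholder assumes a driver that uses that parameter style, so adapt it to whatever you actually use from PHP.

```python
def phrase_search_query(keyword):
    """Build a boolean-mode phrase search as (sql, params).

    Double quotes in the input are dropped so they cannot close the
    phrase early; whitespace is collapsed to single spaces.
    """
    phrase = " ".join(keyword.replace('"', " ").split())
    sql = (
        "SELECT * FROM record_fields "
        "WHERE MATCH (content) AGAINST (%s IN BOOLEAN MODE)"
    )
    return sql, ('"' + phrase + '"',)


sql, params = phrase_search_query("Bank of America")
print(sql)
print(params)
```

Passing the phrase as a parameter instead of interpolating it into the string also keeps the keyword out of the SQL text itself.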
I am building a search feature for the messages part of my site. I have a messages table with a little over 9,000,000 rows and an index on the sender, subject, and message fields. I was hoping to use the MySQL LIKE clause in my query, for example:
SELECT sender, subject, message FROM Messages WHERE message LIKE '%EXAMPLE_QUERY%';
to retrieve results. Unfortunately, MySQL doesn't use indexes when a leading wildcard is present, and the wildcard is necessary because the search term could appear anywhere in the message (this is how the wildcards work, no?). Queries are very, very slow, and I cannot use a full-text index either, because of the annoying 50% rule (I just can't afford to rule that much out). Is there any way to optimize a query using LIKE and two wildcards, or any alternative to it? Any help is appreciated.
You should either use full-text indexes (you said you can't), design a full-text search yourself, or offload the search from MySQL and use Sphinx/Lucene. For Lucene, you can use the Zend_Search_Lucene implementation from Zend Framework, or use Solr.
Normal indexes in MySQL are B+Trees, and they can't be used if the start of the string is not known (which is the case when you have a wildcard at the beginning).
Another option is to implement search on your own, using a reference table. Split the text into words and create a table that contains word, record_id. Then, in the search, you split the query into words and search for each of the words in the reference table. In this way you are not limiting yourself to the beginning of the whole text, but only to the beginning of a given word (and you'll match the rest of the words anyway).
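To make the reference-table idea concrete, here is a rough in-memory sketch in Python. The dict plays the role of the (word, record_id) table, and the startswith check corresponds to a word LIKE 'term%' lookup, which has no leading wildcard and so can use a normal index. The function names and sample messages are invented for the example.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lower-case the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_word_index(messages):
    """Build the word -> record ids mapping (the reference table)."""
    index = defaultdict(set)
    for record_id, text in messages.items():
        for word in tokenize(text):
            index[word].add(record_id)
    return index


def search_words(index, query):
    """Return ids of records containing a word that starts with every query term."""
    result = None
    for term in tokenize(query):
        ids = set()
        for word, record_ids in index.items():
            if word.startswith(term):  # prefix match, like word LIKE 'term%'
                ids |= record_ids
        result = ids if result is None else result & ids
    return result if result is not None else set()


messages = {1: "Example query here", 2: "Another example", 3: "Nothing relevant"}
index = build_word_index(messages)
print(search_words(index, "exam"))       # prefix of a word: found
print(search_words(index, "exam quer"))  # both terms must match the same record
print(search_words(index, "xampl"))      # middle of a word: not found
```

The last call shows the trade-off: you give up matching in the middle of a word in exchange for lookups that an index can serve.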
'%EXAMPLE_QUERY%' is a very, very bad idea. I am going to give you some suggestions:
A. Avoid wildcards at the start of LIKE patterns; use 'EXAMPLE_QUERY%' instead.
B. Create keywords that you can easily search with MATCH.
If you want to stick with using MySQL, you should use FULLTEXT indexes. Full-text indexes index the words in a text block. You can then search on words (and, in boolean mode, on word prefixes with the * operator) and return the results in order of relevance. So you can find the word "example" within a block of text, but you still can't search efficiently on "xampl" to find "example".
MySQL's full text search is not great, but it is functional.
http://dev.mysql.com/doc/refman/5.1/en/fulltext-search.html
select * from emp where ename like '%e';
returns the rows whose ename ends with the letter e.
select * from emp where ename like 'A%';
returns the rows whose ename begins with the letter A.
select * from emp where ename like '_a%';
returns the rows whose ename has a as its second letter.
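If it helps to see what the two wildcards mean, they can be translated into an ordinary regular expression: % is "any run of characters, possibly empty" and _ is "exactly one character". The Python sketch below does only that. It ignores LIKE's escape character and assumes a case-insensitive comparison (which depends on the column's collation in MySQL), and the names like_to_regex and like are invented for this example.

```python
import re


def like_to_regex(pattern):
    """Translate a LIKE pattern into a compiled regex (no escape handling)."""
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")   # any run of characters, including none
        elif ch == "_":
            parts.append(".")    # exactly one character
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.IGNORECASE | re.DOTALL)


def like(value, pattern):
    """True if the whole value matches the LIKE pattern."""
    return like_to_regex(pattern).fullmatch(value) is not None


print(like("Jane", "%e"))    # ends with e
print(like("Adam", "A%"))    # begins with A
print(like("James", "_a%"))  # second letter is a
print(like("Adam", "_a%"))   # second letter is d
```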
A group of people have been inconsistently entering data for a while.
Some people will enter this:
101mxeGte - TS 200-10
And other people will enter this
101mxeGte-TS-200-10
The sad thing is, those are supposed to be identical records.
They will also search inconsistently. If a record was entered one way, some people will search the other way.
Now, I know all about how you can fix data entry for the future, but that's NOT what I am asking about. I want to know how it is possible to:
Leave the data alone, but...
Search for the right thing.
Am I asking for the impossible here?
The best thing I found so far was a suggestion to simply muck about with the existing data, using the REPLACE function in MySQL.
I am uncomfortable with this option, as it means it will certainly actively piss off half of the users. The unfocused angst of all is less than the active ire of half.
The problem is that it has to go both ways:
Entering spaces in the query has to find both space and not-space entries,
and NOT entering spaces ALSO has to find both space and not-space entries.
Thanks for any help you can offer!
The "ideal" solution is pretty straightforward:
1. Decide what is the canonical way of representing a record
2. When someone saves a record, canonicalize it before saving
3. When someone searches for a record, canonicalize the input before searching for it
You could also write a small program to convert all existing data to the canonical form (you will have the code for it anyway, as "canonicalize" in steps 2 and 3 require that you write code that does so).
Edit: some specific information on how to canonicalize
With the sample data you give, the algorithm might be:
1. Replace all spaces with hyphens
2. Replace all runs of one or more hyphens with a single hyphen (a regex would be easiest for this -- actually, a regex can do both steps in one go)
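For what it's worth, here is what the single-regex version of those two steps could look like. It is a Python sketch only (the same idea works with preg_replace in PHP), and canonicalize is just a name picked for the example.

```python
import re


def canonicalize(value):
    """Collapse any run of spaces and/or hyphens into a single hyphen."""
    return re.sub(r"[\s-]+", "-", value.strip())


print(canonicalize("101mxeGte - TS 200-10"))
print(canonicalize("101mxeGte-TS-200-10"))
```

Both sample inputs from the question come out as the same string, so the function can be applied to saved records and to search input alike.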
Is there any practical problem with this approach?
Strip whitespace from BOTH the existing data and the search input. That way the intended record(s) will always be returned. I hope your data set is small, though, because this is going to perform pretty poorly.
Edit: by "existing data" I meant "the query of existing data". My answer was based on assumption that the actual data could not be touched (which might not be correct).
If it were up to me, I'd have the data in the database updated with REPLACE, and on future searches when dealing with the given row remove all spaces in the input.
Presumably your users enter the search terms (or record details, when creating a record) in an HTML form, which then goes to a PHP script. It looks like your data can always be written in a way that contains no spaces, so why don't you do this:
Run a query that strips spaces from the existing data
Add code in the PHP script(s) that receives the form(s), so that it strips spaces from submitted data - whether that data is to be used for search or for writing new data.
Edit: I guess you would also need to change some spaces to hyphens. Shouldn't be too hard to write logic to accomplish that.
Something like this.
pseudo code:
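// mysql_* was removed in PHP 7; mysqli_real_escape_string or a prepared statement is the modern equivalent.
$myinput = mysql_real_escape_string('101mxeGte-TS-200-10');
// Quote the escaped value so MySQL sees a string literal.
$query = " SELECT * FROM table1
WHERE REPLACE(REPLACE(f1, ' ', ''),'-','')
= REPLACE(REPLACE('$myinput', ' ', ''),'-','') ";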
Alternatively you might write your own function to trim the data so it can be compared.
DELIMITER $$
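-- VARCHAR needs a length in a stored function; 255 is an assumed size.
CREATE FUNCTION myTrim(AStr VARCHAR(255)) RETURNS VARCHAR(255) DETERMINISTIC
BEGIN
DECLARE Result VARCHAR(255);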
SET Result = REPLACE(AStr, ' ','');
SET Result = ......
.....
RETURN Result;
END$$
DELIMITER ;
And then use this in your select
$query = " SELECT * FROM table1
WHERE myTrim(f1) = myTrim('$myinput') ";
Have you ever heard of SQL's LIKE?
http://dev.mysql.com/doc/refman/4.1/en/string-comparison-functions.html
There's also regex:
http://dev.mysql.com/doc/refman/4.1/en/regexp.html#operator_regexp
101mxeGte - TS 200-10
101mxeGte-TS-200-10
how about this?
SELECT 'justalnums' REGEXP '101mxeGte[[:blank:]]*(\-[[:blank:]]*)?TS[[:blank:]-]*200[[:blank:]-]*10'
Digits can be represented by [0-9] and letters by [a-z], [A-Z], or [a-zA-Z].
Append a + to match one or more of them. Parentheses let you group and even capture what is inside them and reuse it later in a replace or something else.
RLIKE is the same as REGEXP.
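Rather than writing that pattern by hand for every search, it could be generated from whatever the user typed. The sketch below shows the idea in Python using Python's regex syntax ([\s-]* where the MySQL pattern above uses [[:blank:]-]*); flexible_pattern is a hypothetical helper, not something MySQL provides.

```python
import re


def flexible_pattern(term):
    """Build a regex that allows any mix of spaces/hyphens between the parts."""
    tokens = [t for t in re.split(r"[\s-]+", term.strip()) if t]
    return r"[\s-]*".join(re.escape(t) for t in tokens)


pattern = flexible_pattern("101mxeGte-TS-200-10")
print(pattern)
for stored in ("101mxeGte - TS 200-10", "101mxeGte-TS-200-10"):
    print(stored, bool(re.fullmatch(pattern, stored)))
```

Searching with spaces or without them produces the same pattern, so it works both ways, which was the requirement in the question.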
This question is a challenge for me. My friend can't tell me how to do it, even though he is a really good programmer (I think).
Users can put sentences into the database. When a user enters a sentence, it is saved in the sentences table.
Next, the sentence is split into words, and the soundex of each word is saved into the tags table along with the id of the sentence it came from.
Last, the soundex of each word is put into the weights table; if the same soundex is already there, the function adds 1 to the counter for this soundex.
(For those who don't know: soundex is a function that returns a phonetic representation (the way it sounds) of a string.)
Structure of the database:
One table, sentences, contains two columns: id and sentence.
Another table, tags, contains id (which is the id of a sentence) and tag (which is one word from the sentence).
The tag isn't really the plain word, but the soundex of the word.
The last table, weights, contains tag and weight (a number that tells us how many tags like this there are in the tags table).
My question is: how can I write a function which returns sentences similar to a given string?
It should use tags (the soundex of each word), and each tag should have its own power based on the weights table.
Tags that are used often are more important than rarer, more original tags. Can it be done in just one MySQL query?
Next question: I think this way of looking for similar sentences is good, but what about the speed of this function?
I need to use it very, very often on my site.
Well instead of having a weights table, why don't you have a table that relates tags to sentences? So have a table called sentence_tags with a sentence_id and a tag_id column. Then you can compute the weights by doing a join on those two tables, and still reference back to the sentence that contains the tag. You may as well store both the tag and the soundex in the tags table, while you're at it.
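As a rough illustration of computing the weights from the relation itself instead of storing them, here is a small Python sketch. The (sentence_id, tag) rows stand in for the sentence_tags table, the tag values are placeholder soundex-style codes, and the scoring rule (sum of the weights of shared tags, so frequent tags count more, as the question asks) is only one possible choice.

```python
from collections import Counter, defaultdict

# Hypothetical rows of the sentence_tags relation: (sentence_id, tag).
sentence_tags = [
    (1, "H400"), (1, "W643"),
    (2, "H400"), (2, "T600"),
    (3, "G313"),
]


def tag_weights(rows):
    """Weight of a tag = how many times it occurs, computed on demand."""
    return Counter(tag for _, tag in rows)


def similar_sentences(query_tags, rows):
    """Rank sentences by the summed weight of the tags they share with the query."""
    weights = tag_weights(rows)
    scores = defaultdict(int)
    for sentence_id, tag in rows:
        if tag in query_tags:
            scores[sentence_id] += weights[tag]
    # Highest score first; ties broken by sentence id.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


print(tag_weights(sentence_tags))
print(similar_sentences({"H400", "W643"}, sentence_tags))
```

In SQL the same thing would be a GROUP BY over the joined tables, so nothing derived has to be kept in a separate weights table.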
Perhaps the Levenshtein distance is what you are looking for. It counts the minimum number of single-character edits (insertions, deletions, or substitutions) needed to transform one word into another.
Do realize this is a costly operation.
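To show why it is costly, here is a plain dynamic-programming version in Python (a sketch for illustration, not tuned for speed). Every comparison of two words fills a table of len(a) x len(b) cells, and you would have to do that for each candidate word.

```python
def levenshtein(a, b):
    """Minimum number of insertions, deletions and substitutions to turn a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute (or keep if equal)
            ))
        previous = current
    return previous[-1]


print(levenshtein("kitten", "sitting"))
print(levenshtein("post", "poster"))
```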
Joe K's suggestion seems spot on for good database design.
Do not store information that can be extrapolated.
Meaning, use the join statement and PHP to calculate the weight at run-time.
I understand this may not be the correct solution in your design, but often a few moments spent on smart database structure design will make everything work that much better.