It's been a while since I made one of my "Converting MySQL to PostgreSQL" posts.
So, today's problem is as follows:
The original MySQL queries involve where clauses which look a bit like the following:
WHERE id LIKE '%6%'
OR createdtime LIKE '%6%'
OR modifiedtime LIKE '%6%'
OR start_date LIKE '%6%'
OR end_date LIKE '%6%'
OR sc_related_to LIKE '%6%'
OR tracking_unit LIKE '%6%'
OR message LIKE '%6%'
This query is part of a system-wide search. In this case the system is searching for 6; had I asked it to search for something else, say the word user, the pattern would be %user% instead of %6%.
Now, the problem is that the columns above are not always strings. Integer fields like id and date/time fields like createdtime are being compared to strings. MySQL seems to be okay with this, but grumpy PostgreSQL gets grumpy when it sees this query.
I know that for some fields, I can use the to_char function, so, for example, part of the clause in PostgreSQL might look like this:
to_char(id, '999') LIKE '%6%'
Unfortunately, I can't just go through the queries and add to_char to each applicable field, because of the PHP backend. This is what the PHP code for generating the WHERE clause looks like:
$where .= $tablename.".".$columnname." LIKE '". formatForSqlLike($search_val) ."'";
Note: It's part of a loop, so the above line generates all of the individual comparisons.
So even if to_char gets me around the type comparison, I can't use it here: different data types might need a different second parameter, and even if one parameter worked for all of them, some of the columns are already strings, and passing a string to to_char throws an error.
So I need a way either to get PHP to determine the column type and apply the right to_char (or none at all), or to get PostgreSQL to compare different data types.
Thanks for all of your help, have a good day!
You could just cast everything to a string, such as:
$where .= $tablename.".".$columnname."::text LIKE '". formatForSqlLike($search_val) ."'";
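To make the idea concrete, here is a rough sketch of the same loop, written in Python rather than PHP just to keep it short. The function name build_where, the table name activity and the %s placeholder style (as used by drivers such as psycopg2) are assumptions for illustration, not part of the original code. It casts every column to text, so the column type no longer matters, and it passes the pattern as a parameter instead of concatenating it into the SQL.

```python
def build_where(table, columns, search_val):
    """Return (sql_fragment, params) comparing every column as text.

    Table and column names are assumed to come from trusted schema
    metadata, not from user input, so they are interpolated directly.
    """
    # Escape LIKE wildcards so the user's input is matched literally
    # (backslash is PostgreSQL's default LIKE escape character).
    escaped = (search_val.replace("\\", "\\\\")
                         .replace("%", "\\%")
                         .replace("_", "\\_"))
    pattern = "%" + escaped + "%"
    # CAST(x AS text) is the standard-SQL spelling of PostgreSQL's x::text.
    clauses = ["CAST({}.{} AS text) LIKE %s".format(table, col)
               for col in columns]
    return " OR ".join(clauses), [pattern] * len(columns)


# Hypothetical table and columns, for illustration only.
sql, params = build_where("activity", ["id", "createdtime", "message"], "6")
print(sql)
print(params)
```

The fragment and the parameter list would then go to the driver's execute call together, so no quoting of the search value is needed.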
If you don't want to change the PHP code, another idea is to create a VIEW that does the cast, such as:
CREATE OR REPLACE VIEW TableName AS
SELECT
id::text,
createdtime::text,
modifiedtime::text,
start_date::text,
end_date::text,
sc_related_to::text,
tracking_unit::text,
message::text
FROM RealTable;
Then your PHP code would just use this view instead of the table, and see everything as text.
However, I believe this technique is not going to scale well. Every search will probably do a sequential scan of the entire table, since a LIKE pattern with a leading wildcard cannot use an ordinary B-tree index.
Your best bet is to redesign this around a full-text search index, which will be far more versatile as well as much faster.
I have the following MySQL query:
SELECT * FROM users WHERE name LIKE '%jack%'
I want the results ordered by how closely they match jack, so:
jack
jacker
majack
I can also use PHP.
I think you can accomplish what you want with MySQL's full-text search. Your query would look like this:
SELECT name, MATCH (name) AGAINST ('jack') as score
FROM users ORDER BY score DESC;
There are some conditions to take into consideration when using full-text search:
The storage engine must support FULLTEXT indexes. I think InnoDB (from MySQL 5.6) and MyISAM both do.
You have to add a FULLTEXT index to your table definition.
You can see a working demo here: http://sqlfiddle.com/#!9/72bf5/1
More info about full-text search in MySQL here: http://dev.mysql.com/doc/refman/5.0/en/fulltext-natural-language.html
Also, here is a working example I wrote in PHP using similar_text and array sorting functions: http://ideone.com/UQpBFk
Hope it helps!
I don't think you can do this kind of thing with plain SQL. The first thing you need to do is define what it means for two strings to be similar, for which there are various metrics. You should probably pick one of those and write a stored procedure to sift through the data.
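As one possible definition of "similar", here is a small sketch (in Python, though the same few lines port easily to PHP) that ranks matches by where the search term appears and then by length, so an exact match comes first and a match at the start of the name beats one in the middle. The function name rank_matches and the metric itself are assumptions chosen for illustration; other metrics such as edit distance would work just as well.

```python
def rank_matches(names, term):
    """Keep names containing term; sort by match position, then by length."""
    term = term.lower()
    hits = [n for n in names if term in n.lower()]
    # Earlier match position first, then shorter names, then alphabetical
    # order as a final tie-breaker so the result is deterministic.
    return sorted(hits, key=lambda n: (n.lower().find(term), len(n), n.lower()))


print(rank_matches(["majack", "jacker", "bob", "jack"], "jack"))
# prints ['jack', 'jacker', 'majack']
```

The rows would come from the plain LIKE '%jack%' query, and the sorting would happen in application code afterwards.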
As everyone mentioned, you can't achieve this in the normal way other than with full-text search. But if you know all the different patterns beforehand, you can use the FIELD function to get an approximate result, like this:
SELECT * FROM users WHERE name LIKE '%jack%'
ORDER BY FIELD (name,'jack','jacker','majack')
Try something like this:
// Assumes $mysqli is an open mysqli connection (the old mysql_* functions were removed in PHP 7).
$result = $mysqli->query("SELECT * FROM users WHERE name LIKE '%jack%' ORDER BY name ASC");
while ($row = $result->fetch_assoc())
{
echo $row["name"];
}
I'm having a problem caused mainly by a badly structured database. I'm coding this for a company whose code is quite messy, and the table is very large, so I don't think fixing the structure is an option.
Anyway, my issue is that I'm trying to group by a value that is never alone in the string.
They store values separated by commas, so it looks like this:
field: "category" value: 'var1, var2, var3'
And I will search using this query:
SELECT name, category
FROM companies
WHERE (MATCH(name, category) AGAINST ('$search' IN BOOLEAN MODE)
OR category LIKE '$search%')
It would match, for example, var2 (a row is not limited to three values; it can hold one or many more), and I'd split the string manually in PHP, no problem. The trouble is that I don't get enough distinct matches: I want, say, 10 different suggestions per search. To be more specific, I'm building an autosuggest feature, so I want "moto%" to match motorbike, motor on its own, and so on. Instead I keep getting the same values: a couple of hundred results all contain "motorbike", and I don't know how to filter them, since the bad DB structure keeps me from using GROUP BY.
I found this: T-SQL - GROUP BY with LIKE - is this possible?
It SEEMED like something that would be a solution, but as far as I've tried, I could not get it to work for what I wanted.
So I'm wondering which solutions there are. If there is ABSOLUTELY no way of working around this, I will probably have to fix the DB structure (but this really has to be the last option).
Start taking steps to fix the database structure. Create an extra table and fill it with the split values.
Then you can use proper queries to select the data you need. Both you and the next developer will have less trouble with this project in the future, not to mention the gain in query speed.
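A minimal sketch of that migration, using Python with an in-memory SQLite database as a stand-in for MySQL so it runs anywhere. The company_categories table, the sample rows and the suggest helper are all made up for illustration. Once every category sits in its own row, the autosuggest query becomes a plain SELECT DISTINCT with a prefix match.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE companies (id INTEGER PRIMARY KEY, name TEXT, category TEXT);
    -- Hypothetical new table: one row per (company, category) pair.
    CREATE TABLE company_categories (company_id INTEGER, category TEXT);
""")
conn.executemany("INSERT INTO companies (name, category) VALUES (?, ?)", [
    ("Acme", "motorbike, motor"),
    ("Bolt", "motorbike, motel"),
    ("Cogs", "garden"),
])

# One-off migration: split each comma-separated string into separate rows.
for company_id, categories in conn.execute(
        "SELECT id, category FROM companies").fetchall():
    for cat in categories.split(","):
        if cat.strip():
            conn.execute("INSERT INTO company_categories VALUES (?, ?)",
                         (company_id, cat.strip()))


def suggest(prefix, limit=10):
    """Return up to `limit` distinct categories starting with `prefix`."""
    rows = conn.execute(
        "SELECT DISTINCT category FROM company_categories "
        "WHERE category LIKE ? ORDER BY category LIMIT ?",
        (prefix + "%", limit))
    return [r[0] for r in rows]


print(suggest("moto"))  # prints ['motor', 'motorbike']
```

Because the pattern has no leading wildcard, an index on the category column of the new table can be used for the lookup.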
I am not sure why I cannot write a comment, but maybe you can try this:
SELECT name, category FROM companies WHERE category LIKE '$search%' OR LOCATE('$search', category) > 0;
That checks whether your search value appears anywhere in category.
I would have to agree that you should make the database right. It'll save you much trouble and time later. However, using SELECT DISTINCT may fix your immediate issue.
I have a MySQL table storing some user generated content. For each piece of content, I have a title (VARCHAR 255) and a description (TEXT) column.
When a user is viewing a record, I want to find other records that are 'similar' to it, based on the title/description being similar.
What's the best way to go about doing this? I'm using PHP and MySQL.
My initial ideas are:
1) Either to strip out common words from the title and description to be left with 'unique' keywords, and then find other records which share those keywords.
E.g., in the sentence "Bob woke up at 5 am and went to school", the keywords would be: "Bob, woke, 5, went, school". Then if there's another record whose title talks about 'bob' and 'school', they would be considered 'similar'.
2) Or to use MySQL's full text search, though I don't know if this would be any good for something like this?
Which method would be better out of the two, or is there another method which is even better?
I'll keep this short (it could be way too long)...
I would not select the keywords 'manually' or modify your original data.
MySQL supports full-text search with the MyISAM engine (and, since MySQL 5.6, with InnoDB as well). A full description of the options available when querying is in the MySQL documentation. Depending on the querying method, the query can automatically discard common stop-words and, with MyISAM, words that are too common in the data set (present in more than 50% of the rows). Query expansion is also available; which query type to use depends on your needs.
Consider also using a separate engine like Lucene. With Lucene you will probably get more functionality and better indexing/searching. You can automatically get rid of common words (they get a low score and do not influence the search) and use things like stemming, for instance. There is a bit of a learning curve, but I'd definitely look into it.
EDIT:
The MySQL 'full-text natural language search' returns the most similar rows (and their relevance score) and is not a boolean matching search.
You would start by defining what similar means to you and how you want to score the similarity between two different documents.
Using that algorithm, you can process all your documents and build a table of similarity scores.
Depending on the complexity of your scoring algorithm and the size of the data set, this may not be something you would run in real time; instead you might batch it through something like Hadoop.
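To illustrate what such a scoring step could look like, here is a small sketch in Python. It uses the keyword idea from the question (drop common words, compare what is left) and scores two texts by Jaccard similarity, that is, shared keywords divided by all keywords. The stop-word list, the function names and the choice of Jaccard are assumptions for illustration; a real system would want a proper stop-word list and probably stemming.

```python
import re

# Tiny illustrative stop-word list; a real one would be much longer.
STOP_WORDS = {"a", "am", "and", "at", "in", "is", "of", "on", "the", "to", "up"}


def keywords(text):
    """Lower-case the text and return its set of non-stop-words."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {w for w in words if w not in STOP_WORDS}


def similarity(a, b):
    """Jaccard similarity of the two keyword sets, from 0.0 to 1.0."""
    ka, kb = keywords(a), keywords(b)
    if not ka or not kb:
        return 0.0
    return len(ka & kb) / len(ka | kb)


def similarity_table(docs):
    """docs maps id -> text; returns (id_a, id_b, score) tuples, best first."""
    ids = sorted(docs)
    pairs = [(a, b, similarity(docs[a], docs[b]))
             for i, a in enumerate(ids) for b in ids[i + 1:]]
    return sorted(pairs, key=lambda p: -p[2])


docs = {
    1: "Bob woke up at 5 am and went to school",
    2: "Bob is at school",
    3: "Gardening tips",
}
for row in similarity_table(docs):
    print(row)
```

Comparing every pair is quadratic in the number of records, which is why this kind of table is better built in a batch job than on every page view.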
I have done something like this. I replace all of the spaces in the string with % then use LIKE in the where clause. Here, I will give you my code. It is from MSSQL but minor adjustments can be made to work it with MySQL. Hope it helps.
CREATE FUNCTION [dbo].[fss_MakeTextSearchable] (@text NVARCHAR(MAX)) RETURNS NVARCHAR(MAX)
-- Replaces spaces with wildcard characters to return more matches in a LIKE condition.
-- For example:
-- @text = 'my file' will return '%my%file%'
-- SELECT ... WHERE 'my project files' LIKE @text would then return true
AS
BEGIN
DECLARE @searchableText NVARCHAR(MAX)
SELECT @searchableText = '%' + REPLACE(@text, ' ', '%') + '%'
RETURN @searchableText
END
Then use the function like this:
SELECT @searchString = dbo.fss_MakeTextSearchable(@String)
Then in your query:
SELECT * FROM Table WHERE title LIKE @searchString
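If you would rather build the pattern in application code than in a database function, the same transformation is a one-liner. Below is a sketch in Python, with an in-memory SQLite table standing in for the real one; the names make_searchable and docs and the sample titles are made up for illustration. Unlike the SQL version, it collapses runs of spaces into a single wildcard, which matches the same rows.

```python
import sqlite3


def make_searchable(text):
    """'my file' -> '%my%file%': words must appear in this order."""
    return "%" + "%".join(text.split()) + "%"


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (title TEXT)")
conn.executemany("INSERT INTO docs VALUES (?)",
                 [("my project files",), ("file of mine",)])

pattern = make_searchable("my file")
print(pattern)  # prints %my%file%
print(conn.execute("SELECT title FROM docs WHERE title LIKE ?",
                   (pattern,)).fetchall())
```

Note that the words have to appear in the same order as in the search string, so this is looser than an exact phrase match but stricter than matching each word independently.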
I have a PHP interface with a keyword search, working off a DB(MySQL) which has a Keywords field.
The keywords field is a varchar holding all the words as a comma-separated list, formatted like this:
the, there, theyre, their, thermal etc...
If I want to return just the exact word 'the' from the search, how would this be achieved?
I have tried using 'the%' and '%the' in the PHP, but it fails: it doesn't return all of the rows in which the keyword appears.
Is there a better (more accurate) way to go about this?
Thanks
If you want to select the rows that have exactly the keyword the:
SELECT * FROM table WHERE keyword='the'
If you want to select the rows that have the keyword the anywhere in them:
SELECT * FROM table WHERE keyword LIKE '%the%'
If you want to select the rows that start with the keyword the:
SELECT * FROM table WHERE keyword LIKE 'the%'
If you want to select the rows that end with the keyword the:
SELECT * FROM table WHERE keyword LIKE '%the'
Try this
SELECT * FROM tablename
WHERE fieldname REGEXP '[[:<:]]test[[:>:]]'
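[[:<:]] and [[:>:]] are markers for word boundaries. (MySQL 8.0.4 and later use the ICU regex library, which does not support these markers; use \\b in the pattern there instead.)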
MySQL Regular Expressions
If you also search for the commas, you can be sure you are getting the whole word:
where keywordField like '%, the, %'
or keywordField like '%, the'
or keywordField like 'the, %'
or keywordField = 'the'
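A variation on the same idea is to pad the column with the delimiter on both sides, so that a single pattern covers the first, middle, last and only-entry cases. The sketch below shows this in Python against an in-memory SQLite table (SQLite concatenates with ||; in MySQL you would use CONCAT). The items table, its rows and the helper name are invented for illustration, and it assumes the entries are always separated by a comma followed by one space.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, keywords TEXT)")
conn.executemany("INSERT INTO items (keywords) VALUES (?)", [
    ("the, there, their",),   # id 1: 'the' is the first entry
    ("thermal, the",),        # id 2: 'the' is the last entry
    ("the",),                 # id 3: 'the' is the only entry
    ("there, thermal",),      # id 4: no exact 'the'
])


def rows_with_keyword(word):
    """Return ids of rows whose keyword list contains exactly `word`."""
    # ', ' || keywords || ', ' turns 'a, b' into ', a, b, ', so every
    # entry is surrounded by delimiters and one LIKE pattern is enough.
    rows = conn.execute(
        "SELECT id FROM items "
        "WHERE (', ' || keywords || ', ') LIKE ? ORDER BY id",
        ("%, " + word + ", %",))
    return [r[0] for r in rows]


print(rows_with_keyword("the"))  # prints [1, 2, 3]
```

A search word containing % or _ would still be treated as a wildcard here, so real input should be escaped first.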
Maybe I didn't understand the question properly, but if you want all the rows where 'the' appears, a LIKE '%word%' should work.
If the DB of words is HUGE, MySQL may fail to retrieve some of the words. That can be solved in two ways:
1- Get a DB that supports bigger sizes (not many people would choose this one, though). For example, SQL Server has a CONTAINS function that works better than LIKE '%word%'.
2- Use an external search tool that uses an inverted index. I used Sphinx for a project and it works quite well. This is better if you rarely UPDATE the rows you want to search, which should be the case.
Sphinx, for example, generates an index file from your MySQL table and uses that file to answer searches (it's very fast). The file has to be re-indexed every time you insert or update rows in the table, which makes it a much better solution if you rarely do so.
It looks like you have a one to many relationship going on within a column. It might be better to create a separate table for keywords with a row for each keyword and a foreign key to whatever it is you're searching on.
Using LIKE '%???%' is generally a bad idea because the DB can't make use of an index, so it will scan the whole table. Whether this matters depends on the size of the data you're working with, but it's worth considering up front. The single best way to help DB performance is in the initial table design. This can be tricky to change later.
My users put search terms in an input field. The script I'm using stores that in a $find variable. It then searches mysql like so:
"select * from mytable WHERE title LIKE '%$find%' or description LIKE '%$find%'";
I want to be able to pull a result, even if 1 word is in the title and another in the description, which currently doesn't happen. So.. I'm guessing I need to break the $find variable into an array with a function. After that, how would I do the search in mysql? Since I never know how many words will be in the search (they might decide to search for 8 words at once), how do I reference the array in the mysql query?
Thx in advance!
You should use a FULLTEXT index and build a proper boolean search query on it.
You MAY achieve the effect you want without it, but this would be very slow compared to using FULLTEXT index. You would have to use explode to get the list of words, then build a WHERE condition like concat(title, description) LIKE '%word1%' AND concat(title, description) LIKE '%word2%' etc.
(In my first answer I stated that achieving this effect is impossible without FULLTEXT. I was wrong and edited this answer.)
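For the non-FULLTEXT route, the query can be assembled with one condition per word, whatever the number of words. Here is a rough sketch in Python (the PHP version would use explode and a prepared statement in the same way). It swaps the concat(title, description) test for an equivalent "title LIKE ? OR description LIKE ?" pair per word, mainly so the demo also runs on SQLite; the function name build_search and the sample rows are assumptions for illustration.

```python
import sqlite3


def build_search(find):
    """Return (sql, params) requiring every word in title or description."""
    words = find.split()
    if not words:
        raise ValueError("empty search")
    # One parenthesised condition per word, all joined with AND.
    clause = "(title LIKE ? OR description LIKE ?)"
    where = " AND ".join([clause] * len(words))
    params = []
    for word in words:
        params += ["%" + word + "%"] * 2
    return "SELECT * FROM mytable WHERE " + where, params


# Demo on an in-memory SQLite table standing in for the MySQL one.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (title TEXT, description TEXT)")
conn.executemany("INSERT INTO mytable VALUES (?, ?)", [
    ("red bike", "fast and light"),
    ("blue car", "slow"),
])

sql, params = build_search("bike light")
print(sql)
print(conn.execute(sql, params).fetchall())
```

Passing the words as parameters also avoids putting raw user input into the SQL string, which the original '%$find%' interpolation does.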