Better performance searching SQL table with 170,000 rows - php

I have an SQL table with 170,000 rows. Each row has a string column approximately 600 characters long.
I want to list all the rows that contain a searched keyword.
Using LIKE '% keyword %' takes about 1000ms. My app is built in Laravel using Eloquent.
Do you have any ideas what approach would be best for performance? I need options for case-sensitive/insensitive and accent-sensitive/insensitive searching, and for matching an exact phrase or just multiple words in random order. When I tried TNTSearch, the performance was excellent, but it doesn't offer as many options.
Also, I tried creating an index and using the MATCH ... AGAINST function in my query, but there are also some limitations there.

Define FULLTEXT indexes on the columns you want to be able to search to greatly increase search speed. This only works on MyISAM and InnoDB tables, though.
Google it or have a look here
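A minimal sketch of what that could look like for the question's setup, assuming a table named articles and that the 600-character column is named body (both names are placeholders, adjust them to your schema):

ALTER TABLE articles ADD FULLTEXT INDEX ft_body (body);

-- basic natural-language search using the new index
SELECT id, body
FROM articles
WHERE MATCH(body) AGAINST('keyword');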

Like queries that begin with a wildcard cannot take advantage of indexes. Performance will continue to degrade as your table size grows.
I would recommend one of the following options for improving performance:
You can use Laravel Scout
Laravel Scout provides a simple, driver based solution for adding full-text search to your Eloquent models.
Out of the box, Scout supports Algolia, but there are other drivers available as well, including TNTSearch:
https://github.com/teamtnt/laravel-scout-tntsearch-driver
You can use a fulltext index to improve search performance.
Eloquent does not support fulltext search out of the box, but there are a few third party packages that add support.
For example:
https://github.com/jarektkaczyk/eloquence-base
https://github.com/swisnl/laravel-fulltext
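Packages like these typically wrap plain MATCH ... AGAINST SQL, which you could also issue yourself (for example via whereRaw). A hedged sketch, reusing the assumed articles/body names from above; the + operator makes each word mandatory, which covers the multiple-words-in-random-order case:

SELECT id, body
FROM articles
WHERE MATCH(body) AGAINST('+first +second' IN BOOLEAN MODE);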

I'd recommend using SOLR in combination with Solarium: https://solarium.readthedocs.io/en/latest/

Related

faster way for Search in multiple databases

I am working on a big eCommerce shopping website. I have around 40 databases. I want to create a search page which shows 18 results after searching by title across all databases.
(SELECT id_no,offers,image,title,mrp,store from db1.table1 WHERE MATCH(title) AGAINST('$searchkey') AND title like '%$searchkey%')
UNION ALL (SELECT id_no,offers,image,title,mrp,store from db3.table3 WHERE MATCH(title) AGAINST('$searchkey') AND title like '%$searchkey%')
UNION ALL (SELECT id_no,offers,image,title,mrp,store from db2.table2 WHERE MATCH(title) AGAINST('$searchkey') AND title like '%$searchkey%')
LIMIT 18
Currently I am using the above query. It works fine for keyword searches of 4 or more characters, like laptop, nokia, etc., but takes 10-15 seconds to process; for a query with a keyword of less than 3 characters it takes 30-40 seconds, or I end up with a 500 internal server error. Is there any optimized way of searching across multiple databases? I generated two indexes: a primary key and a full-text index on title.
Currently my search page is in PHP; I am ready to code in Python or any other language if it gets me good speed.
You can use Sphinx: http://sphinxsearch.com/. It is a powerful search engine for databases. IMHO Sphinx is the best choice for search on your site.
FULLTEXT is not configured (by default) for searching for words less than three characters in length. You can configure it to handle shorter words by setting the innodb_ft_min_token_size parameter (ft_min_word_len for MyISAM). Read this: https://dev.mysql.com/doc/refman/5.7/en/fulltext-fine-tuning.html You can only do this if you control the MySQL server; it won't be possible on shared hosting. Try this.
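A quick way to check the current threshold, and to rebuild the index after lowering it in the server configuration (a sketch; both variables are read-only at runtime, and the index name ft_title is assumed):

SHOW VARIABLES LIKE 'innodb_ft_min_token_size';  -- InnoDB, default 3
SHOW VARIABLES LIKE 'ft_min_word_len';           -- MyISAM, default 4

-- after lowering the value in my.cnf and restarting MySQL,
-- the full-text index has to be rebuilt, e.g.:
ALTER TABLE db1.table1 DROP INDEX ft_title, ADD FULLTEXT INDEX ft_title (title);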
FULLTEXT is designed to produce more false-positive matches than false-negative matches. It's generally most useful for populating dropdown picklists like the suggestions that appear under a browser's location field. That is, it requires some human interaction to choose the correct record. Expecting FULLTEXT to do absolutely correct searches is probably a bad idea.
You simply cannot use AND column LIKE '%whatever%' if you want any reasonable performance at all. You must get rid of that. You might be able to rewrite your python program to do something different when the search term is one or two letters, and thereby avoid many, but not all, LIKE '%a%' and LIKE '%ab%' operations. If you go this route, create ordinary indexes on your title columns. Whatever you do, don't combine the FULLTEXT and LIKE searches in a single query.
If this were my project I'd consider using a special table with columns like this to hold all the short words from the title column in every row of each table.
id_pk INT autoincrement
id_no INT
word VARCHAR(3)
Then you can use a query like this to look up short words
SELECT a.id_no,offers,image,title,mrp,store
FROM db1.table1 a
JOIN db1.table1_shortwords s ON a.id_no = s.id_no
WHERE s.word = '$searchkey'
To do this, you will have to preprocess the title columns of your other tables to populate the shortwords tables, and put an index on the word column. This will be fast, but it will require a special-purpose program to do the preprocessing.
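A sketch of what that special table could look like in MySQL, using the columns described above (the actual tokenizing of titles into short words would be done by the preprocessing program, not by SQL; the values shown are just illustrative):

CREATE TABLE db1.table1_shortwords (
  id_pk INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  id_no INT NOT NULL,
  word  VARCHAR(3) NOT NULL,
  KEY idx_word (word)
);

-- the preprocessing program inserts one row per short word found in a title, e.g.
INSERT INTO db1.table1_shortwords (id_no, word) VALUES (1, 'tv'), (1, 'hd');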
Having to search multiple tables with your UNION ALL operation is a performance problem. You will be able to improve performance dramatically by redesigning your schema so you need search only one table.
Having to search databases on different server machines is a performance problem. You may be able to rig up your python program to search them in parallel: that is, to somehow use separate tasks to search each one, then aggregate the results. Each of those separate search tasks requires its own connection to the database, so this is not a cheap or simple solution.
If this system faces the public web, you will have to redesign it sooner or later, because it will never perform well enough as it is now. (Sorry to be the bearer of bad news.) Many system designers like to avoid redesigning systems after they become enormous. So, if I were you I would get the redesign done.
If your focus is on searching, then bend the schema to facilitate searching rather than the other way around.
Collect all the strings to search for in a single table. Whereas a UNION of 40 tables does work, it will be ~40 times as slow as having the strings collected together (see the sketch after these suggestions).
Use FULLTEXT when the words are long enough, use some other technique when they are not. (This addresses your 3-char problem; see also the Answer discussing innodb_ft_min_token_size. You are using InnoDB, correct?)
Use + and boolean mode to say that a word is mandatory: MATCH(col) AGAINST("+term" IN BOOLEAN MODE)
Do not add on a LIKE clause unless there is a good reason.
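A minimal sketch combining the single search table and the boolean-mode advice above (all names here are assumptions, not from the answer):

CREATE TABLE search_index (
  id_no INT NOT NULL,
  store VARCHAR(64) NOT NULL,    -- which of the 40 source databases the row came from
  title VARCHAR(255) NOT NULL,
  FULLTEXT KEY ft_title (title)
) ENGINE=InnoDB;

SELECT id_no, store, title
FROM search_index
WHERE MATCH(title) AGAINST('+nokia +laptop' IN BOOLEAN MODE)
LIMIT 18;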

What's the most efficient way to search multiple MySQL tables for a large quantity of terms?

Taking a PHP array of terms with variable length (i.e. it could be 50 terms, it could be 400), what's the most efficient way of searching my database for each of these terms?
The search I'm trying to do is quite straightforward. For each term, I'd like to do:
SELECT id, post_title FROM wp_posts WHERE post_title LIKE '%term%'
Obviously I can run a foreach in PHP and run multiple MySQL queries, but I'd imagine this to be hugely inefficient.
The code I've most recently tried involves multiple OR statements, but with ~100ish terms it appears to run very slowly.
I have no idea if something like this would work?
SELECT id, post_title FROM wp_posts WHERE post_title LIKE %term1%, %term2%, %term3%, %term4%, [...]
Can I use a more efficient SQL statement, or should I be looking at this in a different way?
Stock MySQL could handle this kind of search using
MATCH (post_title) AGAINST ('term1 term2 term3 term4')
To do this search you will need to add Full Text index into the table using
ALTER TABLE wp_posts ADD FULLTEXT INDEX ft_key1(post_title);
This would be way faster than LIKE '%term%', but please note that Full-Text indexes were long supported only on MyISAM tables (InnoDB has supported them since MySQL 5.6).
However, as your data grows, the bundled MySQL search speed might become an issue. In that case I would suggest using an external search engine like Solr or Sphinx.
If you decide to switch to Sphinx, you may want to take a look at this guide: http://astellar.com/2011/12/replacing-mysql-full-text-search-with-sphinx/
Create an additional index holding a single column where you simply concatenate the values of all table columns you want to query during the search. This way you can use a single SELECT query with a LIKE clause to search through all columns at once.
This is often referred to as "full text search".
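One way to set that up in MySQL (a sketch; the search_blob column name is made up, it assumes the standard WordPress wp_posts columns, and on MySQL 5.7+ a generated column could maintain the combined field automatically):

ALTER TABLE wp_posts ADD COLUMN search_blob TEXT;

UPDATE wp_posts
SET search_blob = CONCAT_WS(' ', post_title, post_excerpt, post_content);

-- one LIKE over the combined column instead of one LIKE per source column
SELECT id, post_title
FROM wp_posts
WHERE search_blob LIKE '%term%';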
Using MySQL the most efficient way is to set up Full-text searching... http://dev.mysql.com/doc/refman/5.0/en/fulltext-search.html
I don't know whether it's any more efficient than the other suggestions, but you could also do
SELECT id, post_title FROM wp_posts WHERE FIND_IN_SET(post_title, '%term1%, %term2%, %term3%, %term4%') <> 0

Making search facility

I want to make a search facility on my website. I'm using PHP.
What criteria should be used for searching?
For example, if someone searches
How to make soap
I can use many approaches for the search, like finding database entries containing exactly the same search string,
or ranking the database entries by the search keywords (i.e. an entry matching only "How" + "Soap" will have less preference than an entry matching "how soap make")...
So what algorithm is generally used for searching?
Also, what is meant by full text search?
This is kind of a big subject for a simple answer, but I think what you mean is how to run complex fulltext searches on MySQL. In other words, this is really a MySQL question, not a PHP one.
Basically, you need to:
1. Create a fulltext index on a text field in your database.
2. Run queries on that database field using MySQL's fulltext syntax.
The basic syntax for querying a fulltext indexed table in MySQL is:
SELECT * FROM table
WHERE MATCH (fulltextfield)
AGAINST ('my search phrase');
There's a lot more to it than that, but the MySQL documentation is the place to go: http://dev.mysql.com/doc/refman/5.0/en/fulltext-natural-language.html
If you want to do really advanced fulltext searches, a good recommendation is Sphinx, but that's probably way more advanced than you need.

MySql select question

I have a search engine. I have to select some data from a table when a user types in the search keywords.
I want to find an alternative to 'LIKE':
SELECT id,text FROM example WHERE text LIKE '$search'
Because the text column usually has loads of words in it and the search term only contains a few words, I don't get accurate results.
Is there any other way of doing this?
It's called full-text indexing, but currently it's not supported in InnoDB, only in MyISAM. An alternative is to use third-party indexing, like Lucene, Solr (which provides web service access on top of Lucene), or Sphinx...
If your table is MyISAM, you can enable full text searching:
ALTER TABLE table ADD FULLTEXT idx_text (`text`);
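With that index in place, the LIKE query from the question could be rewritten along these lines (a sketch):

SELECT id, `text`
FROM example
WHERE MATCH(`text`) AGAINST('$search' IN NATURAL LANGUAGE MODE);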
You could take a look at the MySQL Fulltext mechanism, it provides natural language searches in a fairly easy to use way.
Usually a search engine uses a tree data structure for fast lookups and some sort of graph to weight the search results. Maybe you want to look into the trie data structure and space-filling curves. The latter is useful if you want to compare two documents: for example, if you sort and count all the words you can produce a heat map.

How can I search for multiple terms in multiple table columns?

I have a table that lists people and all their contact info. I want users to be able to perform an intelligent search on the table by simply typing in some stuff and getting back results where each term they entered matches at least one of the columns in the table. To start I have made a query like
SELECT * FROM contacts WHERE
firstname LIKE '%Bob%'
OR lastname LIKE '%Bob%'
OR phone LIKE '%Bob%' OR
...
But now I realize that will completely fail on something as simple as 'Bob Jenkins', because it is not smart enough to search for the first and last name separately. What I need to do is split up the search terms, search for them individually, and then somehow intersect the results from each term. At least that seems like the solution to me. But what is the best way to go about it?
I have heard about fulltext and MATCH()...AGAINST() but that sounds like a rather fuzzy search and I don't know how much work it is to set up. I would like precise yes or no results with reasonable performance. The search needs to be done on about 20 columns by 120,000 rows. Hopefully users wouldn't type in more than two or three terms.
Oh sorry, I forgot to mention I am using MySQL (and PHP).
I just figured out fulltext search and it is a cool option to consider (is there a way to adjust how strict it is? LIMIT would just chop off the results regardless of how well they matched). But this requires a fulltext index, and my website is using a view, and you can't index a view, right? So...
I would suggest using MATCH / AGAINST. Full-text searches are more advanced searches, more like Google's, less elementary.
It can match across multiple tables and rank them according to how many matches they have.
Otherwise, if the word is there at all, esp. across multiple tables, you have no ranking. You can do ranking server-side, but that is going to take more programming/time.
Depending on what database you're using, the ability to search across columns can be more or less difficult. You probably don't want to do 20 JOINs, as that would be a very slow query.
There are also engines such as Sphinx and Lucene dedicated to these types of searches.
BOOLEAN MODE
SELECT * FROM contacts WHERE
MATCH(firstname,lastname,email,webpage,country,city,street...)
AGAINST('+bob +jenkins' IN BOOLEAN MODE)
Boolean mode is very powerful. It might even fulfil all my needs. I will have to do some testing. By placing + in front of the search terms, those terms become required (the row must match 'bob' AND 'jenkins' instead of 'bob' OR 'jenkins'). This mode even works on non-indexed columns, and thus I can use it on a view, although it will be slower (that is what I need to test). One final problem I had was that it wasn't matching partial search terms, so 'bob' wouldn't find 'bobby', for example. The usual % wildcard doesn't work; instead you use an asterisk (*).
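For example, a prefix search in boolean mode would look roughly like this (a sketch using just two of the columns from the query above):

SELECT * FROM contacts WHERE
MATCH(firstname, lastname)
AGAINST('+bob* +jenkins*' IN BOOLEAN MODE)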
