I want to implement a powerful search engine for my ecommerce application. I'm using PHP with MySQL as the database. Can anyone guide me on how to proceed? Is the FULLTEXT feature of MySQL good for a large volume of data?
Thanks!
IMHO, the MySQL Full text engine is a really poor choice.
Firstly, the number of parameters to tweak the search is almost 0.
Secondly, in my experience, it doesn't scale.
You might consider using
Sphinx
Lucene
Lucene is said to be the industry standard. The project also offers Solr if you want search to run as a separate server.
They are far more advanced and perform better.
This should get you started, however you will have to modify or expand on the idea.
For the second part of your question, have a look at:
Pros & Cons of Full Text Search
Recently, for an app handling a huge amount of data, we gave up on both MySQL FULLTEXT and Lucene and switched to PostgreSQL, which has a much more powerful native full-text engine. At least, that is what the results of our investigations said.
Take a look at Zend_Search_Lucene from Zend Framework and a new feature for MySQL full-text search here
I am writing a website which indexes large amounts of data into databases (about 800 tables per database), and the website allows you to search the database for various items. Should I use something like Lucene or just write my own search algorithm? I am using PHP and MySQL. Although I can filter my SELECT queries and create a searching algorithm, I just wanted to know if I should use Lucene, because I am just indexing stuff in a database. Also, please do suggest anything that might help me. I forgot to mention that even though I have 800 tables, they would be pretty small in size.
Lucene is a mature, tested, open source library.
I would definitely say: try to use it as much as possible. It will probably be better and take less time than implementing your own library.
If there is certain functionality that Lucene does not provide, you can always create your own variation of Lucene to take care of it.
Do not underestimate the importance of the community when using products such as Lucene: help is almost always available in Lucene's forums [and SO], and the library is constantly tested and maintained because of the large number of users!
Without seeing your data, answering this question is very hard. However, I can say from personal experience that writing a search of any kind quickly becomes very complex. You have to worry about weighting the various columns you are searching, and search in SQL is almost never as fast as search in a dedicated search engine. At work we are switching from an in-house SQL-based search to Sphinx Search to search our product catalog for this very reason.
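To give a feel for what "weighting the various columns" involves, here is a minimal sketch in Python (used only for illustration; the thread's stack is PHP and MySQL). The field names, the weights, and the sample products are all made up for the example, and a real engine would use a far more refined ranking formula.

```python
import re

# Hypothetical weights: a hit in the title counts three times as much
# as a hit in the description.
FIELD_WEIGHTS = {"title": 3.0, "description": 1.0}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def score(record, query):
    """Sum, over all weighted fields, weight * number of query-term hits."""
    terms = set(tokenize(query))
    total = 0.0
    for field, weight in FIELD_WEIGHTS.items():
        words = tokenize(record.get(field, ""))
        total += weight * sum(1 for word in words if word in terms)
    return total


def search(records, query):
    """Return the records that match at all, best score first."""
    scored = [(score(record, query), record) for record in records]
    scored = [pair for pair in scored if pair[0] > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [record for _, record in scored]


# Made-up sample data.
products = [
    {"title": "Red running shoes", "description": "Lightweight shoes for road running"},
    {"title": "Blue jacket", "description": "Warm jacket, pairs well with running shoes"},
    {"title": "Green hat", "description": "A plain cotton hat"},
]
results = search(products, "running shoes")
```

Even this toy version already needs decisions about tokenizing, weights, and tie-breaking, which is roughly where hand-rolled SQL search starts to get complicated.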
I'm developing a site that could be compared with a tube site (like YouTube). I'm in the design phase and am trying to figure out what search method to go with.
I'm using the SilverStripe framework, which has modules for Sphinx, Solr, and Lucene, so they are obviously interesting. Another option is to simply query the database (MySQL) and not use any search engine.
What would you do? And why?
Any input is appreciated! Thanks in advance!
simply query the database (MySQL) and not use any search engine
I assume you want to use MyISAM's full-text search capabilities? This is possible: SilverStripe's default configuration is currently (at least until version 2.4) set to MyISAM and not InnoDB. However, this is only recommended for simple, small tasks that are not performance hungry. I assume that's not what you want.
Dedicated search services are more powerful, in terms of both speed and features.
For a general overview, take a look at "ElasticSearch, Sphinx, Lucene, Solr, Xapian. Which fits for which usage?", for example.
With the details you've given, any of the five should get your job done, but you might give that some more consideration.
However, I would also consider which search services already have SilverStripe modules available, how well they fit your requirements, and how much you "like" them. Unless you'd want to write a module for ElasticSearch, for example. That would be pretty cool, but I'm not sure it's really worth the effort.
Personally, I'd probably go with https://code.google.com/p/lucene-silverstripe-plugin/ as it's easy to set up and seems to be working well (haven't tried it myself, but I have only heard good things from others about it).
I need to design a search form and the code behind it.
I'm not very familiar with searches.
My table has the following structure:
- Table_ads
site_name
ad_type
uri
pagetitle
name_ad
general_description_ad
age
country_ad
location_ad
zone_ad
Initially my idea was to do a search like Google's, with a single text box and a search button, but I think this will be difficult. The other option is to build a search by fields (a traditional search).
What do you think about this? What type of search should I build?
Best Regards,
PS: Sorry for my English.
For a "Google-like" search it's best to use a Full-Text Search (FTS) solution.
PostgreSQL 8.3 and newer has a built-in FTS engine, and it will let you do all querying in SQL. Example:
SELECT uri FROM ads WHERE fts @@ plainto_tsquery('cereal');
See documentation -> http://www.postgresql.org/docs/current/static/textsearch.html and come back if you have more questions :-)
However, in-database FTS is several times slower than dedicated FTS.
So if you need better performance, you will have to build an index outside of the database.
Here I would recommend Sphinx -> http://sphinxsearch.com/docs/current.html. Sphinx integrates smoothly with PHP. You feed it documents (preferably in the form of a special XML docset) and update the index on demand or with some scheduler. Then you search directly from PHP (without touching the database).
HTH.
Can anyone help me with a good list of PHP site search engines? I am thinking of implementing Google Site Search, but I would rather not pay for that, and I would rather have as much control over it as I can.
Read through Roll your own Search Engine with Zend_Lucene.
The article is rather old though, so have a look at the ZF Reference Guide on Zend_Lucene too. Searching for Zend Lucene on Google should yield plenty of useful results as well.
Sphinx is pretty good, but it isn't written in PHP. It has got PHP libraries to interface with it though. You could also have a look at Zend_Search_Lucene from Zend Framework. Both of these make search indexes so you can do fast searches.
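To illustrate what a "search index" buys you, here is a minimal sketch of an inverted index in Python (for illustration only; it is not how Sphinx or Zend_Search_Lucene are implemented internally, and the page texts are invented). The idea is that words are mapped to the documents containing them once, up front, so a query becomes a few lookups instead of a scan over every page.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Map each word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in tokenize(text):
            index[word].add(doc_id)
    return index


def lookup(index, query):
    """Return the sorted ids of documents containing every query word."""
    words = tokenize(query)
    if not words:
        return []
    matches = set(index.get(words[0], set()))
    for word in words[1:]:
        matches &= index.get(word, set())
    return sorted(matches)


# Made-up pages keyed by a hypothetical page id.
pages = {
    1: "Contact us by email or phone",
    2: "Shipping and returns policy",
    3: "Email newsletter signup",
}
index = build_index(pages)
hits = lookup(index, "email")
```

Real engines add ranking, stemming, and on-disk storage on top of this structure, which is most of the work you avoid by using one.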
You can try the Zend Lucene implementation:
http://framework.zend.com/manual/en/zend.search.lucene.html
http://devzone.zend.com/article/91
You don't have to pay for Google Site Search, and there is only a small chance that more control will mean better-quality results.
If your site is very specific, you will need to write your own search code.
Sphinx is one of the best open source search engines. It has an excellent PHP API and a very good community and forum too. The PHP API for Sphinx is included in the tar/zip file that you download, and it can easily be set up on top of your database. It has great vertical search capabilities. It's pretty simple to implement, so try it out.
Here is a new PHP search engine script that can be added to any website. It is built with PHP 5.4+, MySQL, and Ajax.
https://sourceforge.net/projects/site-search-engine-php-ajax/
It automatically crawls and indexes the site's pages, similar to Sphider.
It can use PDO or MySQLi to connect to the MySQL database.
I have a site that lists movies. Naturally people make spelling mistakes when searching for movies, and of course there is the fact that some movies have apostrophes, use letters to spell out numbers in the title, etc.
How do I get my search script to overlook these errors? Probably need something that's a little more intelligent than WHERE mov_title LIKE '%keyword%'.
It was suggested that I use a fulltext search engine, but all of those things look really complicated, and I feel that building them into my application will be like hell on earth. If I do have to use one, what's the least invasive one, that will be most painless to implement into existing code?
I think you'll have to implement an external fulltext search engine. MySQL just isn't good at fulltext search. I'd say you should give Lucene a go (tutorials). Zend Framework has an API that plugs into Lucene, making it easier to learn and utilize.
Presuming that you use MySQL - MySQL has no in-built functionality that is capable of doing this.
This means you will have to implement a full-text search yourself, or use a third party full text search tool.
If you implement it yourself, you should look into the metaphone or double metaphone algorithms (I'd recommend them over soundex, which is not nearly as good at this type of task) to store phonetic representations of all your words. However, building your own full-text search is no task for the faint-hearted. Don't attempt it if you don't consider yourself a database wizard.
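As a rough idea of the do-it-yourself route, here is a small Python sketch (Python for illustration only) that handles the cases from the question: apostrophes, numbers spelled out, and typos. Note that it swaps in a different technique from the phonetic one above: it normalizes titles and then compares them by string similarity using the standard library's difflib. The number-word table, the 0.6 cutoff, and the sample titles are assumptions made up for the example.

```python
import difflib
import re

# Hypothetical mapping for small numbers spelled out in titles.
NUMBER_WORDS = {"one": "1", "two": "2", "three": "3", "four": "4", "five": "5",
                "six": "6", "seven": "7", "eight": "8", "nine": "9", "ten": "10"}


def normalize(title):
    """Lowercase, drop apostrophes and punctuation, map number words to digits."""
    text = title.lower().replace("'", "").replace("\u2019", "")
    words = re.findall(r"[a-z0-9]+", text)
    return " ".join(NUMBER_WORDS.get(word, word) for word in words)


def find_titles(query, titles, cutoff=0.6):
    """Return titles whose normalized form is close to the normalized query."""
    by_key = {normalize(title): title for title in titles}
    close = difflib.get_close_matches(normalize(query), list(by_key),
                                      n=5, cutoff=cutoff)
    return [by_key[key] for key in close]


# Made-up sample catalogue.
movies = ["Ocean's Eleven", "Twelve Monkeys", "Seven Samurai", "The Sixth Sense"]
typo_matches = find_titles("oceans elevn", movies)
digit_matches = find_titles("7 samurai", movies)
```

This compares the query against every title, so it only suits a small catalogue; storing the normalized form in its own indexed column would be the next step for anything larger.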
If you want a third party tool, Lucene is the way to go. It is ported into tons of different languages/platforms including PHP - you don't have to use Java.
I've used neither PHP nor MySQL, but an alternative to full-text search might be soundex searches.
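For reference, here is a minimal sketch of the classic American Soundex coding in Python (again for illustration only; both PHP and MySQL ship their own soundex functions, whose edge-case behavior may differ slightly from this version). Words that sound alike get the same four-character code, so you can store the code next to each title and compare codes instead of raw strings.

```python
# Consonant groups and their Soundex digits.
CODES = {}
for letters, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                       ("l", "4"), ("mn", "5"), ("r", "6")):
    for ch in letters:
        CODES[ch] = digit


def soundex(word):
    """Return the four-character Soundex code, or "" if there are no letters."""
    letters = [c for c in word.lower() if "a" <= c <= "z"]
    if not letters:
        return ""
    code = letters[0].upper()
    previous = CODES.get(letters[0], "")
    for ch in letters[1:]:
        digit = CODES.get(ch, "")
        if digit:
            # Adjacent letters with the same digit are coded only once.
            if digit != previous:
                code += digit
            previous = digit
        elif ch not in "hw":
            # A vowel (or y) separates repeated digits; h and w do not.
            previous = ""
        if len(code) == 4:
            break
    return code.ljust(4, "0")


same_code = soundex("Schindler") == soundex("Shindler")
```

Soundex was designed for English surnames, so expect false matches on longer titles; treat it as a coarse first filter rather than a full answer to misspellings.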