Implementing a site search engine that searches static pages - php

What I would like to accomplish is to integrate a search feature into my website that can search my web pages, which are static (the content does not change). The search engine must be free to use and must run on JavaScript or PHP (with MySQL if needed). I have tried looking on Google (if anyone is wondering), but maybe I'm just not searching for the right thing. If anyone could point me in the right direction, I would greatly appreciate it.
Thanks

Why reinvent the wheel - use Google Custom Search: http://www.google.com/cse/

I found something today, so I'm updating this for other users:
Google Internal Site Search script (JavaScript, free)
Need a powerful internal search engine script to allow visitors to search the contents of your site? This script uses Google to enable comprehensive search on your site. Cut-and-paste installation that works on any type of site.
Sphider (PHP, free)
Sphider is a lightweight web spider and search engine written in PHP, using MySQL as its back end database. It is suitable for adding search functionality to small or medium sites (up to around 20,000 pages). It also works great as a tool for site analysis - finding broken links, gathering statistics about the site etc.
TSEP (PHP, free)
TSEP is a search engine for your website! You can put a "Search this site" box anywhere on your website and let people quickly find what they are looking for.
Zoom Search Engine (PHP, commercial $49-$99)
Zoom is a robust PHP script for adding a powerful custom search engine to your website, intranet, or CD/DVD.
Perlfect Search (Perl, free)
An integrated, general-purpose site indexer and search engine. It comes as a pair of distinct scripts: the indexer, which automatically scans and indexes a web site, and the search engine, a CGI script that serves search queries for keywords over the index and displays results pages in HTML, in a standard format including title, description and relevance ranking for each matching document.
CGIWorld Site Search (Perl, commercial $25)
SiteSearch gives you the ability to search your website quickly & easily by the use of the password protected browser based administration area. Set the path of the directory you want searched, set the files & directories you want searched, and also the directories & files you do not want searched. SiteSearch is a great tool for the average website of around or below 500 pages.
Fluid Dynamics Search Engine (Perl, free and commercial versions)
FDSE is an easy-to-install search engine for local and remote sites. It returns fast, accurate results from a template-driven architecture. Freeware and shareware versions are available with Perl source.
ASP Site Search (ASP, free)
This ASP Site Search application is commented on each line of code to make it easier for a beginner to follow or to customise. The Site Search application comes in two versions; the Advanced version has more functions but requires that the web server has the VB Scripting Engine 5 or above installed.
Site Search Pro (ASP, commercial)
Site Search Pro 2.0 is a comprehensive search script for ASP or PHP sites.
Reference: http://www.javascriptkit.com/howto/search2.shtml

You might want to look at this. (For anyone who struggles their way through this problem)
JSE internal search engine
http://www.javascriptkit.com/script/script2/jse/
Uses regular expressions to efficiently and rapidly search the index for matches based on the entered keywords. Supports basic logic (ie: negation).
Returns the results on a separate page from the search form itself, neatly formatted. Uses session cookies to transmit the query between the two pages.
Stores the index (url, keywords and description for each page you wish to be "crawled") in the "results" page. This means the index is loaded only when a search has actually been performed, saving on bandwidth and download time.
Searches title, description and designated keywords within the index for a match.
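To make the idea of a hand-built index with keyword matching and negation concrete, here is a rough sketch. It is not JSE's code (JSE is JavaScript; this is Python for brevity), and the index entries, the `search` function and the `-term` negation syntax are all assumptions made up for illustration.

```python
import re

# Hypothetical index: one entry per page, with the fields JSE is described as
# storing (url, keywords, description) plus a title.
INDEX = [
    {"url": "/about.html", "title": "About us",
     "keywords": "company history team", "description": "Who we are."},
    {"url": "/products.html", "title": "Products",
     "keywords": "widgets gadgets", "description": "Our widgets and gadgets."},
    {"url": "/contact.html", "title": "Contact",
     "keywords": "email phone team", "description": "How to reach the team."},
]


def search(query, index=INDEX):
    """Return URLs whose entry contains every plain term and no '-negated' term."""
    required, excluded = [], []
    for term in query.split():
        if term.startswith("-") and len(term) > 1:
            excluded.append(term[1:])  # assumed syntax: -word means "must not contain"
        else:
            required.append(term)

    results = []
    for entry in index:
        haystack = " ".join((entry["title"], entry["keywords"], entry["description"]))

        def has(term):
            # re.escape keeps user input from being interpreted as a pattern.
            return re.search(re.escape(term), haystack, re.IGNORECASE) is not None

        if all(has(t) for t in required) and not any(has(t) for t in excluded):
            results.append(entry["url"])
    return results
```

With this sample index, `search("team -history")` keeps the contact page and drops the about page. An index this small can be scanned linearly on every query; that stops being practical well before a site reaches thousands of pages.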

"Sphider is a lightweight web spider and search engine written in PHP, using MySQL as its back end database. It is a great tool for adding search functionality to your web site or building your custom search engine. Sphider is small, easy to set up and modify, and is used in thousands of websites across the world."
http://www.sphider.eu/

A bit late, but I would suggest Tipue Search.
It's pure JavaScript and can be integrated with any page.
https://github.com/Tipue/Tipue-Search

Swiftype is another, more recent addition to the market: https://swiftype.com/

Related

How do I build a search engine in PHP to search live content of multiple sites?

I am a relatively novice programmer with a good understanding of PHP, but I tend to read, understand and copy the bits I need rather than develop from scratch.
I have a list of over 1000 URLs I would like to search. I would like to search those pages for content on demand and return only results containing the text query I provide. I have looked at Google Custom Search Engine as an easy option, and it works well but limits the number of pages I can add.
I've looked into cURL, but it doesn't seem to offer what I'm looking for, unless I'm missing something.
Or are there other options like Google CSE that are free and easy to use?
You can write a crawler for the pages you need and use the Sphinx engine (http://sphinxsearch.com/) to search them. In my opinion, writing the crawler with the HTTP extension is better than using the pure cURL library.

Make a JavaScript-aware Crawler

I want to make a script that crawls a website and returns the locations of all the banners shown on a page.
The banners are usually served from known domains. But banners are not in the HTML as a simple image or .swf file. Most of the time JavaScript is used to show the banner.
So if a .swf file or image file is loaded from a banner domain, the script should return that URL.
Is that possible to do? And roughly how could I do it?
Ideally it would also return the landing page of that ad. How can I solve that?
You could use selenium to open the pages in a real browser and then access the DOM.
PhantomJS might also be worth a look - it's a headless version of WebKit (the engine behind Safari and, formerly, Chrome).
However, none of those solutions are pure PHP - if that's a requirement, you'll probably have to write your own JavaScript engine in PHP (which is nothing I'd ask my worst enemy to do ;))
In order to get the output of the JavaScript you will need a JavaScript engine (such as Google's V8 engine). The V8 engine is written in C++, but there are some resources that explain how to embed it into PHP.
With that said, you have to study the output "by hand" and determine exactly what can be scraped and how to identify it. Once you've identified some common syntax for the advertisement banners, then you can write a script to extract the banner and the landing page which is referenced.
None of this is easy work, but if you have an example of an ad you'd like to collect then I can give you more advice.
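As a rough sketch of the extraction step, assume a real browser (Selenium, PhantomJS) has already run the page's JavaScript and handed back the rendered HTML as a string. The code below then collects every `src`, `href` and `data` attribute and keeps the URLs whose host belongs to a list of banner domains. The domain names, `ResourceCollector` and `find_banners` are hypothetical, and the example is in Python rather than PHP.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

# Hypothetical banner domains; replace with the ad hosts you actually know about.
BANNER_DOMAINS = {"ads.example.com", "banners.example.net"}


class ResourceCollector(HTMLParser):
    """Collects the value of every src/href/data attribute in the markup."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("src", "href", "data") and value:
                self.urls.append(value)


def is_banner_host(host, domains=BANNER_DOMAINS):
    """True if host is one of the banner domains or a subdomain of one."""
    host = (host or "").lower()
    return any(host == d or host.endswith("." + d) for d in domains)


def find_banners(rendered_html):
    """Return the URLs in already-rendered HTML that point at a banner domain."""
    collector = ResourceCollector()
    collector.feed(rendered_html)
    return [u for u in collector.urls if is_banner_host(urlparse(u).hostname)]
```

Because `href` is included, click-through links that point at a banner domain are returned too, which is a starting point for finding the landing page. Following those redirects to the final advertiser URL is a separate step not shown here.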

advanced searchbox with php

If you visit the Simple Machines (SMF) forums, you will notice that at the top of the page they have a search box with a drop-down box next to it, which lets you narrow your search to different parts of the site. I was curious whether there is a script that would let me add the same functionality to my site, or any places where I can find examples to learn from so I can build such a script on my own. (Note: I have very little programming knowledge, so the easier the better.)
TIA,
kristin
Is this what you're looking for?
Search Mod

Search for contents inside a website

Could anybody please give me some useful links on this topic? I need to create a content search for my website. I have tried Google but did not get useful material on this topic. Please help me.
While Google Custom Search is a good solution, and you didn't give much information, a simple Google search does turn up some good results:
Sphider, which I think I used years ago:
Sphider is a lightweight web spider and search engine written in PHP, using MySQL as its back end database. It is a great tool for adding search functionality to your web site or building your custom search engine. Sphider is small, easy to set up and modify, and is used in thousands of websites across the world.
PhpDig (on the 2nd page of results, so it was hard to find). I know I've used this before; it's another 'installable' PHP-based search engine:
PhpDig is a web spider and search engine written in PHP, using a MySQL database and flat file support. PhpDig builds a glossary with words found in indexed pages. On a search query, it displays a result page containing the search keys, ranked by occurrence.
Sphinx + PHP, an older article, I can't really speak to how well it fits your needs, but it might be a good place to start if you don't want to use a ready made script:
While Google and its ilk are virtually omniscient, the Web's mighty search engines aren't well suited to every site. If your site content is highly specialized or distinctly categorized, use Sphinx and PHP to create a finely tuned local search system.
About's PHP Search Tutorial, certainly nothing special (it's quite a simplification of a search engine), but another place to start if you want to write it yourself:
Our search engine tutorial assumes that all the data you want to be searchable is stored in your MySQL database. It will not have any fancy algorithms - just a simple LIKE query, but it will work for basic searching and give you a jumping off point to make a more complex searching system.
Of course, more information would mean better answers.
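For reference, the "simple LIKE query" approach mentioned in the tutorial description looks roughly like the sketch below. It is not the tutorial's code: it uses Python with an in-memory SQLite database instead of PHP and MySQL so that it runs on its own, and the `pages` table, its columns and the sample rows are invented for the example.

```python
import sqlite3


def make_demo_db():
    """Build a small in-memory database standing in for the site's real tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE pages (url TEXT, title TEXT, body TEXT)")
    conn.executemany(
        "INSERT INTO pages VALUES (?, ?, ?)",
        [
            ("/index.html", "Home", "Welcome to the site"),
            ("/search.html", "Search tips", "How to search this site"),
            ("/pricing.html", "Pricing", "100% free for 1000 pages"),
            ("/faq.html", "FAQ", "Supports up to 1000 pages"),
        ],
    )
    return conn


def like_search(conn, term):
    """Return URLs of pages whose title or body contains term (substring match)."""
    # Escape LIKE wildcards so a user-typed '%' or '_' is matched literally.
    escaped = term.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    pattern = f"%{escaped}%"
    # Bound parameters (?) keep the user's input out of the SQL text itself.
    rows = conn.execute(
        "SELECT url FROM pages "
        "WHERE title LIKE ? ESCAPE '\\' OR body LIKE ? ESCAPE '\\' "
        "ORDER BY url",
        (pattern, pattern),
    )
    return [row[0] for row in rows]
```

A LIKE search with a leading wildcard cannot use an ordinary index, so it scans every row. That is fine for a small site; the spider-based tools and Sphinx listed above are aimed at larger ones.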
I have tried Google but did not get useful material on this topic
Have you tried Google?
Seriously, Google Custom Search is very easy to set up and does not require any PHP programming. It doesn't integrate 100% in your site's design but works well.

How to build an in-site search engine with PHP?

I want to build an in-site search engine with PHP. Users must log in to see the information, so I can't use the Google or Yahoo search engine code.
For now, I want the engine to search the text of the pages, not the tables in the MySQL database.
Has anyone ever done this? Could you give me some pointers to help me get started?
You'll need a spider that harvests pages from your site (in a cron job, for example), strips the HTML and saves the text in a database.
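A minimal sketch of the "strip HTML and save" half of such a spider is below, in Python with SQLite rather than PHP and MySQL so it is self-contained. Fetching the pages (including logging in first, since the site requires it) is left out, and `TextExtractor`, `strip_html`, `store_page` and the `pages` table are names assumed for the example.

```python
import sqlite3
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.parts.append(data)


def strip_html(html):
    """Return the page's text with tags removed and whitespace collapsed."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(" ".join(parser.parts).split())


def store_page(conn, url, html):
    """Save (or refresh) the stripped text for one URL."""
    conn.execute("CREATE TABLE IF NOT EXISTS pages (url TEXT PRIMARY KEY, body TEXT)")
    conn.execute(
        "INSERT OR REPLACE INTO pages (url, body) VALUES (?, ?)",
        (url, strip_html(html)),
    )
    conn.commit()
```

Keying the table on the URL means a scheduled re-crawl overwrites the old text for a page instead of piling up duplicates.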
You might want to have a look at Sphinx (http://sphinxsearch.com/). It is a search engine that can easily be accessed from PHP scripts.
You can cheat a little bit, the way the much-hated Experts-Exchange web site does. They are a for-profit programmers' Q&A site, much like StackOverflow. In order to see answers you have to pay, but sometimes the answers come up in Google search results. It is rather clear that E-E presents one page to web crawlers and a different one to humans. You could use the same trick, then add Google Custom Search to your site. Users who are logged in would then see the results; otherwise they'd be bounced to the login screen.
Do you have control over your server? Then I would recommend that you install Solr/Lucene for indexing and SolPHP for interacting with it from PHP. That way you can have facets and other nice full-text search features.
I would not spider the actual pages; instead I would spider versions of the pages without navigation and other things that are not content related.
Solr requires Java on the server.
In the end I used Sphider, which is a free tool, and it works well with PHP.
Thanks all.
If the content and the titles of your pages are already managed by a database, you will just need to write your search engine in PHP. There are plenty of ways to query your database, for example:
http://www.webreference.com/programming/php/search/
If the content is just contained in HTML files and not in the database, you might want to write a spider.
You may also be interested in caching the results to improve performance.
I would say that everything depends on the size and the complexity of your website/web application.
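As one illustration of caching search results, the sketch below memoizes a search function in memory, in Python. `slow_search` is a hypothetical stand-in for whatever database or index lookup the site really uses, and the `CALLS` counter exists only to show when the slow path runs.

```python
from functools import lru_cache

CALLS = {"count": 0}  # demo-only counter: how often the slow lookup really ran


def slow_search(query):
    """Hypothetical stand-in for the real database or index lookup."""
    CALLS["count"] += 1
    pages = {"/a.html": "php search engine", "/b.html": "static pages"}
    # Return a tuple so the cached value cannot be mutated by callers.
    return tuple(url for url, text in sorted(pages.items()) if query in text)


@lru_cache(maxsize=256)
def _cached(normalized_query):
    return slow_search(normalized_query)


def cached_search(query):
    """Search with caching; queries are normalized so 'PHP ' and 'php' share an entry."""
    return _cached(" ".join(query.lower().split()))
```

When the pages are re-indexed, the cached answers become stale; calling `_cached.cache_clear()` at that point resets the cache.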
