search engine script that can index hash-fragment urls - php

i was wondering if anyone had a link or tutorial for a search script that indexes urls with the # fragment.
For instance my url 'mysite.com/#About/index' isnt recognised by search engines etc. But i was wondering if there is a script i can use in my website that will help visitors search through these urls?
Entering 'mysite.com/#About/index' into an address bar would return the single page interface with the dynamic content loaded into it already.
I might be thinkin way ahead of myself but maybe a search script that indexes all pages then before displaying results sticks a '#' after the first '/' and takes off the '.php' from the end. Is this even possible??
Able to use php jquery and mysql

Have a look at this link to find out how you can make your content accessible to Google.
Google have established guidelines on this matter
http://code.google.com/web/ajaxcrawling/
http://code.google.com/web/ajaxcrawling/docs/getting-started.html

Related

Crawling websites and dynamic urls

Do search engine robots crawl my dynamically generated URLs? With this I mean html pages generated by php based upon GET variables in the url. The links would look like this:
http://www.mywebsite.com/view.php?name=something
http://www.mywebsite.com/view.php?name=somethingelse
http://www.mywebsite.com/view.php?name=something
I have tried crawling my website with a test crawler found here: http://robhammond.co/tools/seo-crawler but it only visits my view page once, with just one variable in the header.
Most of the content on my website is generated by these GET variables from the database so I would really like the search engines to crawl those pages.
Some search engines do, and some don't. Google for one does include dynamically generated pages: https://support.google.com/webmasters/answer/35769?hl=en
Be sure to check your robots.txt file to ensure files you do not want the crawlers to see are blocked, and that files you do want indexed are not blocked.
Also, ensure that all pages you want indexed are linked via other pages, that you have a sitemap, or submit individual URLs to the search engine(s) you want to index your site.
Yes, search engines will crawl those pages, assuming they can find them. Best thing to do is to simply create links to those pages on your website, particularly accessible, or at least traversable from the home page.

PHP scripting to use search engine

I'm making a simple website with a form where a user enters a word and then I take that word and google it using a php script and displaying the results. How would I be able to do this? or where would I get a script to do this?
I'm fairly sure reusing their results as you want to do is against their Terms of Service.
Why not simply create a form that redirects to the Google result page? Here is a tutorial on how.
Linking to Google Search Results
You would look for Google Site Search - http://www.google.com/sitesearch/
First, as quick solution, you should redirect your users to Google's result page.
When your sript receives the search term, it may
change spaces to "+" sign,
append it to the following URL: http://www.google.com/search?q=site%3Aexample.com+ (change example.com to your domain),
redirect to it.
Example: http://www.google.com/search?site=&q=site%3Astackoverflow.com+banana

How to make search engines index search results on my website?

I have a classifieds website.
It has an index.html, which consists of a form. This form is the one users use to search for classifieds. The results of the search are displayed in an iframe in index.html, so the page wont reload or anything. However, the action of the form is a php-page, which does the work of fetching the classifieds etc.
Very simple.
My problem is, that google hasn't indexed any of the search results yet.
Must the links be on the same page as index.html for google to index the Search Results? (because it is currently displayed in an iframe)
Or is it because the content is dynamic?
I have a sitemap which works, with all URLS to the classifieds in the sitemap, but still not indexed.
I also have this robots.txt:
Disallow: /bincgi/
the php code is inside the /bincgi/ folder, could this be the reason why it isn't being indexed?
I have used rewrite to rewrite the URLS of the classifieds to
/annons/classified_title_here
And that is how the sitemap is made up, using the rewritten urls.
Any ideas why this isn't working?
Thanks
If you need more input let me know.
If the content is entirely dynamic and there is no other way to get to that content except by submitting the form, then Google is likely not indexing the results because of that. Like I mentioned in a comment elsewhere, Google did some experimental form submission on large sites in 2008, but I really have no idea if they expanded on that.
However, if you have a valid and accessible Google Sitemap, Google should index your classifieds fine. I suggest to use the Google Webmaster Tools to find out how Google treats your site and to diagnose any potential problems with crawling.
To use ebay is probably a bad example as its not impossible that google uses custom rules for such a popular site.
Although it is worth considering that ebay has text links to categories and sub categories of auction types, so it is possible to find auction items without actually filling in a form.
Personally, I'd get rid of the iframe, it's not unreasonable when submitting a form to load a new page.
that question is not answerable with the information given, to many open detail questions. if you post your site domain and URLs that you want to get indexed.
based on how you use GWT it can produce unindexable content.
Switch every parameters to GET
Make html links to those search queries on "known by Googlebot" webpages
and they'll be index

I have an iframe which content I need Google to index. Is this possible?

I have a classifieds website.
The index.html has a form:
<form action="php_page" target="iframe" etc...>
The iframe displays the results, and the php_page builds the results for the iframe. Basically the php_page builds a table containing the results from a mysql db, and outputs it.
My problem is that this doesn't get indexed by google.
How can I solve this?
The reason I used an Iframe in the first place was to avoid page-reloading when hitting submit.
Ajax couldn't be used due to various reasons I wont go into here.
Any ideas what to do?
Thanks
UPDATE:
I have a sitemap with URLS to all the classifieds also, but I don't think this guarantees google to spider those URLS.
Trying to make the google spider crawl the results of a search form is not really the right approach.
Assuming you want google.com users to find your classifieds ads by searching google, the best approach is to create a set of static html pages from the ads, and link them (not invisibly) from elsewhere on your site (probably best from the home page - but such a link can be in a footer or something else unobtrusive)
They can also be linked to from your sitemap XML (you do have a sitemap XML file don't you?)
Note: the <iframe> doesn't really come into this. Or Ajax.
There is no way to make any webspider fill out and submit forms.
Workaround: Every night, create a dump of the database and save the HTML to a file. Create a link from index.html to that file. Use CSS classes to make the link invisible. This way, Google will pick it up but users won't see it.

How to redirect a Google search result to a dynamic Web page?

I'm trying to enter a list of items into Google Base via an XML feed so that, when a user searches for one of these items and then clicks the search result link in Google Base (or plain Google), the user is directed to a dynamic Web page on my Web site. I'm assuming that the only way to specify a specific link (either static or dynamic) is through the attribute in the XML feed. Is that correct? So, for example, if my attribute is:
http://www.example.com/product1-info.html
the user will be directed to the product1-info.html page.
But if, instead of a static product page, I want to have the user redirected to a dynamic page that generates search results from my local database (on my Web site) for all products containing the keyword "product1", would I be able to do something like this?:
http://www.example.com/products.php?productID=product1
Finally, and most importantly, is there any way to specify this landing page (or any specific landing page) from a "regular" Google search? Or is it only possible via Google Base and the attribute? In other words, if I put a bunch of stuff into Google Base, if any of it shows up in a regular Google search, is there a way for me to control what parameters get passed to the landing page (and thus, what search is performed on the landing page), or is that out of my control? I hope I explained this correctly. Thanks in advance for any help.
first question: Yes, urls containing a query_string part are allowed.
http://base.google.com/support/bin/answer.py?hl=en&answer=78170 says:XML example:
<link>http://www.example.com/asp/sp.asp?cat=12&id=1030</link>
--
Let me rephrase the second question to see if I understand it correctly (might be completely on the wrong track): E.g. products.php?productID=product1 performs a db-search for the product "FooEx" and products.php?productID=product2 for "BarPlus". Now you want google to show the link .../products.php?productID=product1 but not ....?productId=product2 if someone searched for "FooEx" and google decided that your site is relevant? Then it's the same "problem" we all face with search engines: communicate what each url is relevant for. I.e. e.g. have the appropriate (and only the appropriate) keywords appear in the title/h1 element of the page, avoid linking to the same contents with different urls (e.g. product.php?x=1&productId=1 <-> product.php?productId=1&x1, different urls requesting most probably the exact same contents), submit a sitemap, and so on and on....
edit:
and you can avoid the query-string part all together by using something like mod_rewrite (e.g. the front controller for the zend framework makes use of it) or by parsing the contents of $_SERVER["PATH_INFO"] (this requires the webserver to provide that information), e.g. http://localhoast/test.php/foo/bar -> $_SERVER['PATH_INFO']=='/foo/bar'
Also take a look at the link to this thread: How to redirect a Google search result to a dynamic Web page?, it contains the title of the thread, but SO is perfectly happy with How to redirect a Google search result to a dynamic Web page?, too. The title is "only" additional data for search engines and (even more) the user.
You can do the same:
http://www.example.com/products.php/product1/FooEx <-> http://www.example.com/products.php/product2/BarPlus

Categories