I am currently working on an eCommerce-style project that uses a search engine to browse 7,000+ entries stored in a database. Every one of these search results contains a link to a full description page. I have been looking into creating clean (slug) URLs for this. My goal is that when a user clicks a search result, the browser navigates to a new page using the slug URL, for example:
www.mydomain.com/category/brown-fox-statue-23432323
I have a system in place to convert a string or sentence into URL form. However, it is not clear to me what the next steps are once these URLs are created. What is the general plan for implementing this system? Do the URLs need to be stored in a database? Am I supposed to be using POST or GET data from the search result page to create the content of these full description pages?
I appreciate any suggestions!
Many thanks in advance!
Each product has a unique url associated with it in the database.
When you perform a search you just return the correct unique url.
That way you only ever work out what the URL should be once, when the product is first added, and that URL will always relate to that one product. This is the stage at which you use your system to create the URL.
Can you tell us whether you are using a framework? Some frameworks (like Zend) have ini / xml files for routing. But you will still need to store the URLs, or at least the article slugs, in a db.
Storing the entry URLs in the db once they have been generated is necessary because you want the slug for each entry to stay the same. This allows for better caching and SEO, which will improve your site's usability.
Hope that helps!
Edit: Saw your question about pulling up individual articles. You will have to start by setting up a relation between your entries and their URLs in your database. Create a url table with url_id and url columns, then place url_id on the entry table. Whenever someone requests a URL, search the url table for the current URL, get its url_id, and then pull the matching entry. At that point it's just a matter of styling the page to make it look the way you want.
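The url / entry relation described above could be sketched roughly like this. This is only an illustration in Python with an in-memory SQLite database, not the asker's actual stack; the `slugify`, `add_entry` and `find_entry` helpers and the exact column layout are assumptions made for the example.

```python
import re
import sqlite3


def slugify(title: str, entry_id: int) -> str:
    """Lowercase the title, collapse non-alphanumerics to '-', append the id."""
    words = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    return f"{words}-{entry_id}"


def create_schema(conn: sqlite3.Connection) -> None:
    # Hypothetical layout: one row per URL, referenced from the entry table.
    conn.execute(
        "CREATE TABLE url (url_id INTEGER PRIMARY KEY, url TEXT UNIQUE NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE entry (entry_id INTEGER PRIMARY KEY, title TEXT NOT NULL, "
        "url_id INTEGER NOT NULL REFERENCES url(url_id))"
    )


def add_entry(conn: sqlite3.Connection, entry_id: int, category: str, title: str) -> str:
    """Work out the URL once, when the entry is first added, and store it."""
    url = f"/{category}/{slugify(title, entry_id)}"
    cur = conn.execute("INSERT INTO url (url) VALUES (?)", (url,))
    conn.execute(
        "INSERT INTO entry (entry_id, title, url_id) VALUES (?, ?, ?)",
        (entry_id, title, cur.lastrowid),
    )
    return url


def find_entry(conn: sqlite3.Connection, path: str):
    """Look up the requested path in the url table and return the entry, if any."""
    return conn.execute(
        "SELECT e.entry_id, e.title FROM entry e "
        "JOIN url u ON u.url_id = e.url_id WHERE u.url = ?",
        (path,),
    ).fetchone()
```

The search results page would then only need to print the stored URL for each hit, and the description page would call something like `find_entry` with the requested path.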
A common approach is to have a bijective (reversible) function that can convert a "regular" URL into a user-friendly URL:
E.g.:
www.mydomain.com/category/brown-fox-statue-23432323
<=>
www.mydomain.com/index.php?category=brown-fox-statue-23432323
Then you need not keep record of this mapping (convention vs. configuration).
Search StackOverflow for "User Friendly URL Rewriting" for information on how to achieve this automatically with Apache. This question is a good starting point.
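To make the two-way mapping above concrete, here is a small sketch of the convention as a pair of functions. It is written in Python purely for illustration (in practice the server's rewrite rules would do this), and the function names `to_friendly` and `to_internal` are made up for the example.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

PREFIX = "/category/"


def to_friendly(internal_url: str) -> str:
    """/index.php?category=<slug>  ->  /category/<slug>"""
    query = parse_qs(urlsplit(internal_url).query)
    return PREFIX + query["category"][0]


def to_internal(friendly_path: str) -> str:
    """/category/<slug>  ->  /index.php?category=<slug>"""
    if not friendly_path.startswith(PREFIX):
        raise ValueError(f"not a category URL: {friendly_path!r}")
    slug = friendly_path[len(PREFIX):]
    return "/index.php?" + urlencode({"category": slug})
```

Because each function undoes the other, nothing has to be stored: the friendly URL can always be derived from the internal one and back again.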
Related
I'm sure I'm not the first person who has thought about this but I haven't had any luck forming that proper search query in google to find the info. Here's what I'm wanting to do:
I have a CodeIgniter-based site. I'm going to store basic content in tables in the db. I'm thinking that I would store the domain names in a table too, and use the unique id of the table row to query the appropriate content from the db for the rest of the views. For example, MyDomain.com is #1 in the table, followed by YourDomain.com. If the visitor arrived at the site by typing YourDomain.com, then somehow CI would "see" that and query the content for that domain from the db.
Does this make sense? Has anyone else tried it? Is it possible?
Haven't done it myself, but I did some searching for "codeigniter multi site" and found some useful links, this being one of them that seemed to step you through the process.
In general, the HTTP request carries a Host header that identifies the host the user asked for (PHP exposes it as $_SERVER['HTTP_HOST']). You can read that value and use it to index into your database to extract the right content.
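As a rough sketch of that lookup, assuming the domain table has been loaded into a simple mapping (the `SITES` dict and `site_id_for_host` name are hypothetical, and Python is used here only for illustration rather than CodeIgniter):

```python
# Hypothetical stand-in for the domain table: host name -> row id.
SITES = {"mydomain.com": 1, "yourdomain.com": 2}


def site_id_for_host(host_header: str):
    """Normalise the Host value and look up the matching site id, or None."""
    host = host_header.strip().lower()
    host = host.split(":")[0]      # drop an optional port (assumes no IPv6 literal)
    if host.startswith("www."):    # assumption: www.X and X are the same site
        host = host[4:]
    return SITES.get(host)
```

The returned id can then be used in every later query to pull the content for that domain only.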
I have created a widget for my web application. Users get a code snippet and simply paste it into their website, and my widget then works on their site, much like the Twitter, Digg and other social widgets.
Each widget is tied to a single post (say postid: 234), so anyone can embed the widget for that post on their website.
Now I want to know where each widget has been embedded and for which post. To do that I record the URL of the page when the widget starts (onload), but a problem arises when someone places the widget in the common sidebar of their blog or website. I record the URL on every load, so if the widget is in a blog's sidebar it records a URL for every post, which creates duplicates.
Can anyone help with this? What should I do so that I have only a single record for a widget on a site?
I think doing something like this is a bit tricky. Here are some ideas that come to mind.
You could, for example, ask the user to input their site's URL when they get the widget, or the widget could track the domain or subdomain, thus giving fewer URLs.
Just tracking the domain would obviously be problematic if the actual site is domain.com/sitename/, and there could be more than one site under the domain. In that case, you could attempt to detect the highest common directory. Something like this:
You have multiple URLs like this: domain.com/site/page1, domain.com/site/page2, and so on. Here the highest common directory would be domain.com/site.
I don't think that will always work correctly or provide completely accurate results. For accuracy, I think the best is to just ask the user for the URL when they download the code for the widget.
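The "highest common directory" idea can be sketched as below. This is just one way to read it, written in Python for illustration; the function name is made up, and it assumes all recorded URLs for a widget share one host (otherwise it gives up and returns None).

```python
import posixpath
from urllib.parse import urlsplit


def highest_common_directory(urls):
    """Return host + deepest directory shared by all URLs, or None if hosts differ."""
    hosts, directories = set(), []
    for url in urls:
        # Allow scheme-less input such as "domain.com/site/page1".
        parts = urlsplit(url if "//" in url else "//" + url)
        hosts.add(parts.netloc.lower())
        # Treat the last path segment as a page and keep only its directory.
        directories.append(posixpath.dirname(parts.path or "/"))
    if len(hosts) != 1:
        return None
    common = posixpath.commonpath(directories)
    return (hosts.pop() + common).rstrip("/")
```

With the two URLs from the example above this yields domain.com/site, so both page views collapse into a single record.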
Edit: new idea - just generate a unique ID for each user. This could be accomplished by simply taking the current timestamp or something similar, and embedding it in the code snippet the user is supposed to copy. This way you can track the ID itself, and any URLs and domains it appears on can be grouped under it.
If an ID doesn't get a hit in, say, a week, you could remove it from your database, and that way avoid filling it up with unused IDs.
I agree with Jani regarding a unique id. When you dish out the script you'll then be able to always relate back to that id. You are still going to have duplicates if the user uses the same id over and over, but at least you'll have a way of differentiating one user from another. Another useful advantage is that you are now able to, as Jani said, group by the ID and get a cumulative number for all of the instances where that user used the script & id.
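Here is a minimal sketch of tracking by ID and grouping under it. Note that it swaps in a random UUID instead of the timestamp suggested above (an assumption on my part, to avoid two snippets generated at the same instant getting the same ID), and the `WidgetLog` class is an in-memory stand-in for whatever table would really hold the hits.

```python
import uuid
from collections import defaultdict
from urllib.parse import urlsplit


def new_widget_id() -> str:
    """ID to embed in the snippet handed to the user (random, not a timestamp)."""
    return uuid.uuid4().hex


class WidgetLog:
    """In-memory stand-in for the tracking table, keyed by widget ID."""

    def __init__(self):
        self._hosts = defaultdict(set)   # widget_id -> hosts it was seen on
        self._hits = defaultdict(int)    # widget_id -> total loads

    def record(self, widget_id: str, page_url: str) -> None:
        self._hosts[widget_id].add(urlsplit(page_url).netloc.lower())
        self._hits[widget_id] += 1

    def sites(self, widget_id: str):
        return sorted(self._hosts.get(widget_id, ()))

    def hits(self, widget_id: str) -> int:
        return self._hits.get(widget_id, 0)
```

A sidebar widget loaded on many posts of one blog then shows up as one site with a cumulative hit count, rather than one record per post.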
I am creating a classifieds website.
I'm storing all ads in a MySQL database, in different tables.
Is it possible for these ads to be found somehow through Google's search engine?
Is it possible to create meta information about each ad so that Google finds them?
How do major companies do this?
I have thought about auto-generating an HTML page for each ad inserted, but 500 thousand auto-generated HTML pages doesn't really sound like a good solution!
Any thoughts and ideas?
UPDATE:
Here is my basic website so far:
(ALL PHP BASED)
I have a search engine which searches database for records.
After finding and displaying search results, you can click on a result ('ad') and then PHP fetches info from the database and displays it, simple!
In the 'put ad' section of my site, you can put your own ad into a mysql database.
I need to know how I should make Google find the ads on my website as well, as I don't think the Google crawler can search my database just because users can.
Please explain your answers more thoroughly so that I understand fully how this works!
Thank you
Google doesn't find database records. Google finds web pages. If you want your classifieds to be found then they'll need to be on a Web page of some kind. You can help this process by giving Google a site map/index of all your classifieds.
I suggest you take a look at Google Basics and Creating and submitting Sitemaps. Basically the idea is to spoon-feed Google every URL you want Google to find. So if you reference your classifieds this way:
http://www.mysite.com/classified?id=1234
then you create a list of every URL required to find every classified and yes this might be hundreds of thousands or even millions.
The above assumes a single classified per page. You can of course put 5, 10, 50 or 100 on a single page and then create a smaller set of URLs for Google to crawl.
Whatever you do however remember this: your sitemap should reflect how your site is used. Every URL Google finds (or you give it) will appear in the index. So don't give Google a URL that a user couldn't reach by using the site normally or that you don't want a user to use.
So while 50 classifieds per page might mean fewer requests from Google, if that's not how you want users to use your site (or a view you want to provide) then you'll have to do it some other way.
Just remember: Google indexes Web pages not data.
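As an illustration of the "list of every URL" idea, a sitemap file in the sitemaps.org XML format could be generated along these lines. This is a Python sketch rather than PHP; `build_sitemap` is a made-up helper and the base URL simply reuses the example address above.

```python
from xml.sax.saxutils import escape


def build_sitemap(ids, base="http://www.mysite.com/classified?id="):
    """Return sitemap XML with one <url> entry per classified id."""
    lines = [
        '<?xml version="1.0" encoding="UTF-8"?>',
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ]
    for classified_id in ids:
        # escape() protects any &, < or > that might appear in the URL.
        lines.append(f"  <url><loc>{escape(base + str(classified_id))}</loc></url>")
    lines.append("</urlset>")
    return "\n".join(lines)
```

The sitemap protocol caps a single file at 50,000 URLs, so a site with hundreds of thousands of classifieds would split the list across several files and reference them from a sitemap index.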
How would you normally access these classifieds? You're not just keeping them locked up in the database, are you?
Google sees your website like any other visitor would see your website. If you have a normal database-driven site, there's some unique URL for each classified where it is displayed. If there's a link to it somewhere, Google will find it.
If you want Google to index your site, you need to put all your pages on the web and link between them.
You do not have to auto-generate a static HTML page for everything. All pages can be dynamically created (JSP, ASP, PHP, what have you), but they need to be accessible to a web crawler.
Google can find you no matter where you try to hide. Even if you can somehow fit yourself into a mysql table. Because they're Google. :-D
Seriously, though, they use a bot to periodically spider your site so you mostly just need to make the data in your database available as web pages on your site, and make your site bot-friendly (use an appropriate robots.txt file, provide a search engine-friendly site map, etc.) You need to make sure they can find your site, so make sure it's linked to by other sites -- preferably sites with lots of traffic.
If your site only displays specific results in response to search terms you'll have a harder time. You may want to make full lists of the records available for people without search terms (paged appropriately if you have lots of data).
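A paged "browse everything" listing might look roughly like this. It is only a sketch in Python; the `/ads?page=N` URL pattern and the helper names are assumptions made for the example.

```python
import math


def page_count(total_records: int, per_page: int = 50) -> int:
    """Number of listing pages needed (at least one, even when empty)."""
    return max(1, math.ceil(total_records / per_page))


def page_slice(records, page: int, per_page: int = 50):
    """Records shown on a given 1-based page."""
    start = (page - 1) * per_page
    return records[start:start + per_page]


def listing_urls(total_records: int, per_page: int = 50):
    """Plain links a crawler can follow without typing a search term."""
    return [f"/ads?page={n}" for n in range(1, page_count(total_records, per_page) + 1)]
```

Each listing page links to the individual records on it and to the next page, so a crawler can reach every record by following links alone.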
First, create a PHP file that pulls the index plus a human-readable reference for all records.
That is your main page, broken out into categories (as Craigslist.com does by country and state).
Each category link then feeds the selected value back to the PHP script, at whatever level, until it finally reaches the ad itself.
So, if the selected category contains more categories (as states contain cities), display the next list of categories. Otherwise, display the list of ads for that city.
This gives Google a way to index a site (that is, a MySQL db) dynamically, without creating static content for the millions (billions or trillions) of records involved.
This is just an idea of how to get Google to index a database.
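The drill-down logic above (show sub-categories if there are any, otherwise show the ads) can be sketched as follows. The nested `CATEGORIES` dict is a hypothetical stand-in for the database, and Python is used here only to show the branching, not as a replacement for the PHP script.

```python
# Hypothetical category tree: dicts are categories, lists are the ads in a leaf.
CATEGORIES = {
    "USA": {
        "Texas": {"Austin": ["ad-1", "ad-2"], "Dallas": ["ad-3"]},
    },
}


def listing_for(path):
    """Walk the selected categories and decide what the page should list."""
    node = CATEGORIES
    for name in path:
        node = node[name]
    if isinstance(node, dict):
        # The category contains more categories: list them as links.
        return ("categories", sorted(node))
    # Leaf category: list the ads themselves.
    return ("ads", list(node))
```

Every level is an ordinary page full of links, which is all a crawler needs to work its way down to the individual ads.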
I'm trying to enter a list of items into Google Base via an XML feed so that, when a user searches for one of these items and then clicks the search result link in Google Base (or plain Google), the user is directed to a dynamic Web page on my Web site. I'm assuming that the only way to specify a specific link (either static or dynamic) is through the attribute in the XML feed. Is that correct? So, for example, if my attribute is:
http://www.example.com/product1-info.html
the user will be directed to the product1-info.html page.
But if, instead of a static product page, I want to have the user redirected to a dynamic page that generates search results from my local database (on my Web site) for all products containing the keyword "product1", would I be able to do something like this?:
http://www.example.com/products.php?productID=product1
Finally, and most importantly, is there any way to specify this landing page (or any specific landing page) from a "regular" Google search? Or is it only possible via Google Base and the attribute? In other words, if I put a bunch of stuff into Google Base, if any of it shows up in a regular Google search, is there a way for me to control what parameters get passed to the landing page (and thus, what search is performed on the landing page), or is that out of my control? I hope I explained this correctly. Thanks in advance for any help.
First question: yes, URLs containing a query string part are allowed.
http://base.google.com/support/bin/answer.py?hl=en&answer=78170 gives this XML example:
<link>http://www.example.com/asp/sp.asp?cat=12&id=1030</link>
--
Let me rephrase the second question to see if I understand it correctly (I might be completely on the wrong track): say products.php?productID=product1 performs a db search for the product "FooEx" and products.php?productID=product2 for "BarPlus". Now you want Google to show the link .../products.php?productID=product1, but not ...?productID=product2, if someone searched for "FooEx" and Google decided that your site is relevant? Then it's the same "problem" we all face with search engines: communicating what each URL is relevant for. For example, have the appropriate (and only the appropriate) keywords appear in the title/h1 element of the page, avoid linking to the same content with different URLs (e.g. product.php?x=1&productId=1 <-> product.php?productId=1&x=1, different URLs most probably requesting the exact same content), submit a sitemap, and so on.
edit:
And you can avoid the query-string part altogether by using something like mod_rewrite (e.g. the front controller for the Zend Framework makes use of it) or by parsing the contents of $_SERVER["PATH_INFO"] (this requires the web server to provide that information), e.g. http://localhost/test.php/foo/bar -> $_SERVER['PATH_INFO']=='/foo/bar'
Also take a look at the link to this thread: How to redirect a Google search result to a dynamic Web page?, it contains the title of the thread, but SO is perfectly happy with How to redirect a Google search result to a dynamic Web page?, too. The title is "only" additional data for search engines and (even more) the user.
You can do the same:
http://www.example.com/products.php/product1/FooEx <-> http://www.example.com/products.php/product2/BarPlus
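Parsing that kind of PATH_INFO value could look like the sketch below (in Python for illustration; `parse_path_info` is a made-up helper). It assumes the first segment is the product ID and the second is an optional title that exists only for readers and search engines.

```python
def parse_path_info(path_info: str):
    """Split '/product1/FooEx' into (product_id, title); title may be None."""
    segments = [segment for segment in path_info.split("/") if segment]
    if not segments:
        return None
    product_id = segments[0]
    # The title segment is decorative; only the ID selects the record.
    title = segments[1] if len(segments) > 1 else None
    return product_id, title
```

Since only the ID is used for the lookup, a URL with a missing or outdated title still resolves to the same product.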
How can I make it so that content from a database is available to search engines, like google, for indexing?
Example:
Table in mysql has a field named 'Headline' which equals 'BMW M3 2005'.
My site name is 'MySite'
User enters 'BMW M3 2005 MySite' in google and the record will show up with results?
Google indexes web pages, so you will need to have a page for each of your records. That doesn't mean you need to create 1,000 HTML pages by hand; the approach below will dynamically and easily provide a seemingly unique page for each product.
For example:
www.mydomain.com/buy/123/nice-bmw-m3-2005
You can use a rewrite rule in .htaccess to map this URL internally to:
www.mydomain.com/product.php?id=123
Within this script you can dynamically create each page with up-to-date information by querying your database based on the product id in this case 123.
Your script will provide each record with its own title ('Nice BMW M3 2005'), a nice friendly URL ('www.mydomain.com/buy/123/nice-bmw-m3-2005'), and you can include the correct meta information too, as well as images, reviews, etc.
Job done, and you don't have to create hundreds of static HTML pages.
For more information on .htaccess, check out this tutorial.
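To show what the rewrite does, here is the same mapping expressed as a small function. This is not .htaccess syntax, just a Python sketch of the pattern a rewrite rule would apply; the regular expression and the `rewrite` name are assumptions for the example.

```python
import re

# /buy/<numeric id>/<slug>  ->  /product.php?id=<numeric id>
PRETTY_URL = re.compile(r"^/buy/(\d+)/[a-z0-9-]+/?$")


def rewrite(path: str):
    """Return the internal script URL for a friendly path, or None if no match."""
    match = PRETTY_URL.match(path)
    if match is None:
        return None
    # Only the id is passed on; the slug is there for people and search engines.
    return f"/product.php?id={match.group(1)}"
```

The visitor and the crawler only ever see the friendly URL, while product.php receives the plain id it needs for the database query.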
What ILMV is trying to explain is that you have to have HTML pages that Google and other search engines can 'crawl' in order for them to index your content.
Since your information is loading dynamically from a database, you will need to use a server-side language like PHP to dynamically load information from the database and then output that information to an HTML page.
You have any number of options for how to accomplish this specifically. ILMV's suggestion is one of the better ways, though.
Basically what you need to do first is figure out how to pull the information from the database and then use PHP (or another server-side language) to output the information to an HTML page.
Then you will need to determine whether you want to use the uglier, default url style for php driven pages:
mysite.com/products.php?id=123
But this url is not very user or search engine friendly and will result in your content not being indexed very well.
Or you can use some sort of URL rewriting mechanism (mod_rewrite in a .htaccess file does this, or you can look at more complex PHP-oriented solutions like Zend Framework that provide what's called a Front Controller to handle mapping of all requests) to make your URLs look like:
mysite.com/products/123/nice-bmw-m3-2006
This is what ILMV is talking about with regard to url masking.
Using this method of dynamically loading content will allow you to develop a single page that loads the information for any number of different products based on the ID, thus making it seem to the various search engines as though you have a unique page for each product.
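The single-page idea above can be sketched like this: one function that takes an ID, fetches the record and emits a complete HTML page with its own title and meta description. It is written in Python for illustration only; the `ADS` dict is a hypothetical stand-in for the MySQL table and the sample description text is invented.

```python
from html import escape

# Hypothetical stand-in for the database table, keyed by product id.
ADS = {
    123: {
        "headline": "Nice BMW M3 2005",
        "description": "One owner, full service history.",  # invented sample text
    },
}


def render_product(product_id: int, site_name: str = "MySite"):
    """Return (status, html) for one record, built on request from the data."""
    ad = ADS.get(product_id)
    if ad is None:
        return 404, "<h1>Not found</h1>"
    title = escape(f"{ad['headline']} - {site_name}")
    description = escape(ad["description"])
    page = (
        "<!DOCTYPE html>\n"
        f"<html><head><title>{title}</title>\n"
        f'<meta name="description" content="{description}"></head>\n'
        f"<body><h1>{escape(ad['headline'])}</h1><p>{description}</p></body></html>"
    )
    return 200, page
```

To a crawler each ID produces a distinct page with its own title and description, even though there is only one script behind all of them.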