I have a Joomla! website with rewrite rules activated. My article URL is mysite.com/category/ID-alias.html. The only thing that matters in this URL is the ID, because I can access the article with any text in place of "category" and any text in place of "alias".
Let's show a concrete example:
My article URL: mysite.com/flowers/15-begonia.html
I can access the same by changing category name and alias directly from url:
mysite.com/tralala/15-anything.html //Shows the same article as above.
Is this bad for SEO? If one of my visitors wants to destroy my website's SEO, can he open my articles with different addresses (like above) so that Google will say the articles are duplicated? Does Google know when a visitor goes to a webpage that isn't linked from anywhere?
Hope my question is clear.
Thanks.
Google does a good job of deciding which is the "right" version of a page - it is worth watching this video to see how they handle this situation:
http://www.youtube.com/watch?v=mQZY7EmjbMA
Since these wrong URLs should not be linked to from anywhere, it is unlikely they will be indexed by mistake.
However, should they index the wrong version of a page, submitting a sitemap containing the right one will usually fix it.
A visitor could not harm your SEO with this knowledge. The worst they could do would be to provide good links to a non-indexed page, which would cause the wrong URL to be indexed. However, it would then be very easy for you to 301 redirect that page and turn their attempts at harm into an SEO benefit.
I personally think Joomla should look into adding the canonical tag, but if you want it right now, you must use an extension like this:
http://extensions.joomla.org/extensions/site-management/seo-a-metadata/url-canonicalization-/25795
(NB I have never used this extension so cannot guarantee its quality - the reviews are good, though)
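For what it's worth, the core of what such an extension does is small. A minimal sketch in plain PHP - this is not Joomla's actual API, and the article lookup is a stub standing in for a database query:

<?php
// Build the canonical URL from the article ID alone, so that
// mysite.com/tralala/15-anything.html and mysite.com/flowers/15-begonia.html
// both declare mysite.com/flowers/15-begonia.html as the canonical address.

function getArticleById(int $id): ?array
{
    // Stand-in for a database lookup.
    $articles = [15 => ['category' => 'flowers', 'alias' => 'begonia']];
    return $articles[$id] ?? null;
}

function canonicalUrl(int $id): ?string
{
    $article = getArticleById($id);
    if ($article === null) {
        return null;
    }
    return sprintf('http://mysite.com/%s/%d-%s.html', $article['category'], $id, $article['alias']);
}

// Emitted in the <head> of every URL variant of the article:
printf('<link rel="canonical" href="%s" />', htmlspecialchars(canonicalUrl(15)));

Whichever URL variant is requested, the tag points at the same address, so search engines consolidate the duplicates onto it.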
Related
I'm building a little database-driven PHP CMS. I'm trying to figure out the best strategy for this scenario:
I have a URL like this:
http://www.my.com/news/cool-slug
Someone saves or share this URL (or it gets indexed by Google).
Now I realize that the slug is not quite right and change it to:
http://www.my.com/news/coolest-slug
Google and users who previously saved the URL will hit a 404 error.
Is this the best and common solution (showing the 404) or should I keep a table in my database with all the history of the generated URLs mapped to the ID of the page and redirect with a 301 header?
Will this be an unnecessary load on my system (this table can get lots of records...)?
One very common solution used by many sites (including StackOverflow as far as I can tell) is to include the ID in the URL. The slug is just here for SEO/beauty/whatever, but is not used to identify the page.
Example: http://stackoverflow.com/questions/27877901/strategy-for-permanent-links-not-wordpress
As long as you have the right ID, it doesn't matter what slug you use. The site will just detect that the slug is wrong, and generate a redirect to the right one. For example, the following URL is valid:
http://stackoverflow.com/questions/27877901/old-slug
If for some reason you do not want the ID in the URL, then either you forbid changes (many news sites do that: you can notice that sometimes the slug and the title of a news article do not match), or you have to live with the occasional 404 when the slug changes. I've never seen a website with a system for keeping the slug history, as it can be quite annoying (you won't be able to "reuse" a slug, for example).
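To make that concrete, here is a minimal PHP sketch of the pattern, assuming URLs shaped like /news/{id}/{slug}; the post lookup is a stub standing in for a database query:

<?php
// Route /news/{id}/{slug}: the page is found by ID, the slug is decorative.
// A wrong or missing slug gets a 301 redirect to the canonical URL, so old
// shared links keep working after a slug change.

function findPost(int $id): ?array
{
    // Stand-in for a database lookup.
    $posts = [15 => ['slug' => 'coolest-slug', 'title' => 'The coolest post']];
    return $posts[$id] ?? null;
}

$path = parse_url($_SERVER['REQUEST_URI'], PHP_URL_PATH); // e.g. /news/15/cool-slug
if (preg_match('#^/news/(\d+)(?:/([^/]*))?$#', $path, $m)) {
    $post = findPost((int) $m[1]);
    if ($post === null) {
        http_response_code(404);
        exit('Not found');
    }
    if (($m[2] ?? '') !== $post['slug']) {
        // Outdated or mangled slug: permanent redirect to the canonical URL.
        header('Location: /news/' . $m[1] . '/' . $post['slug'], true, 301);
        exit;
    }
    echo htmlspecialchars($post['title']); // render the page here
}

Note there is no slug-history table to maintain: any slug the post ever had still resolves through the ID and collapses onto the current canonical URL with a single 301.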
I'm deciding whether or not to make the slug mandatory in order to view a submission.
Right now either of these works to get to a submission:
domain.com/category/id/1/slug-title-here
domain.com/category/id/1/slug-blah-foo-bar
domain.com/category/id/1/
All go to the same submission.
You can also change the slug to whatever you want and it'll still work, as it just checks for the category, id, and submission # (as in the second example).
I'm wondering if this is the proper way to do this? From an SEO standpoint should I be doing it like this? And if not, what should I be doing to users who request the URL without the slug?
The slug in the url can serve three purposes:
It can act as a content key when there is no ID (you have an ID, so this one doesn't apply)
When just a url is posted as a link to your site, it can let users know what content to expect because they see it in the url
It can be used by search engines as a ranking signal (Google does not use url words as a ranking signal very much right now as far as I can tell)
Slugs can create problems:
URLs are longer, harder to type, harder to remember, and often get truncated
It can lead to multiple URLs for the same page and to SEO problems with content duplication
I am personally not a fan of using a slug unless it can be made the content key because of the additional issues it creates. That being said, there are several ways to handle the duplicate content problems.
Do nothing and let search engines sort out duplicate content
They seem to be doing better at this all the time, but I wouldn't recommend it.
Use the canonical tag
When a user visits any of the URLs for the content, they should get a canonical tag like this:
<link rel="canonical" href="http://domain.com/category/id/1/slug-title-here" />
As far as Google is concerned, the canonical tag can even exist on the canonical url itself, pointing to itself. Bing has advised against self referential canonical tags though. For more information on canonical tags see: http://googlewebmastercentral.blogspot.com/2009/02/specify-your-canonical.html
Use 301 redirects
Before canonical tags, the only way to avoid duplicate content was with 301 redirects. Your software can examine the URL path and compare the slug to the correct slug. If they don't match, it can issue a 301 redirect that sends the user to the canonical URL with the correct slug. The Stack Overflow software works this way.
So these URLs:
domain.com/category/id/1/slug-blah-foo-bar
domain.com/category/id/1/
would redirect to
domain.com/category/id/1/slug-title-here
which would be the only url that would actually have the content.
Assuming you're not ever going to change a page's slug, I'd just set up domain.com/category/id/1/ to do a 301 redirect (permanent) to domain.com/category/id/1/slug-title-here, and any time someone enters a slug which is incorrect for that article (domain.com/category/id/1/slug-title-here-oops-this-is-wrong), also 301 them to the correct address.
That way you're saying to the search engines "I don't have duplicate content, look, this is a permanent redirect" so it doesn't harm your SEO, and you're being useful to the user in always taking them to the correct "friendly url" page.
I suggest you add a rel=canonical link tag.
This avoids doing a redirect each time, considering someone can link to your page with infinite variants like this:
domain.com/category/id/1/?fakeparam=1
From an SEO standpoint, as long as you're only ever linking to one version of those URLs, it should be fine, as the other URLs won't even be picked up by the search engine (since nowhere links to them).
If however you are linking to all three, then it could hurt your rankings (as it'll be considered duplicate content).
I personally wouldn't make the slug required, but I would make sure that (internally) all links would point to a URL including the slug.
I am planning an informational site in PHP with MySQL.
I have read about Google Sitemaps and Webmaster Tools.
What I did not understand is whether Google will be able to index the dynamic pages of my site using any of these tools.
For example, if I have URLs like www.domain.com/articles.php?articleid=103
Obviously this page will always have the same title and the same meta information, but the content will change according to articleid. So how will Google come to know about the article on the page to display it in search?
Is there some way that I can get Google rankings for these pages?
A URL is a URL, Google doesn't give up when it sees a question mark in one (although excessive parameters may get ignored, but you only have one). All you need is a link to a page.
You could alternatively make the URL SEO-friendly with mod_rewrite: www.domain.com/articles/103
RewriteRule ^articles/([0-9]+)$ articles.php?articleid=$1 [L]
I do suggest you give each individual page relevant meta tags of no more than 80 characters, and don't place the article content within a table tag, as Google's placement algorithm is strict. Random, unrelated links will also harm the rank.
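On the question's worry about every article having the same title and meta information: the script can emit unique metadata per articleid. A minimal sketch of articles.php, with the lookup stubbed in place of a MySQL query:

<?php
// articles.php - give every articleid its own <title> and meta description
// so Google does not see identical metadata on every article.

function loadArticle(int $id): ?array
{
    // Stand-in for a MySQL query.
    $articles = [103 => [
        'title'   => 'How to plant begonias',
        'summary' => 'A short guide to planting begonias in spring.',
        'body'    => 'Full article text goes here.',
    ]];
    return $articles[$id] ?? null;
}

$article = loadArticle((int) ($_GET['articleid'] ?? 0));
if ($article === null) {
    http_response_code(404);
    exit('Article not found');
}
?>
<!DOCTYPE html>
<html>
<head>
<title><?= htmlspecialchars($article['title']) ?></title>
<meta name="description" content="<?= htmlspecialchars($article['summary']) ?>">
</head>
<body>
<h1><?= htmlspecialchars($article['title']) ?></h1>
<p><?= htmlspecialchars($article['body']) ?></p>
</body>
</html>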
You have to link to the page for Google to notice it. And the more links you have the higher up in Google's result list your page will get. A smart thing to do is to find a page where you can link to all of your pages. This way Google will find them and give them a higher ranking than if you only link to them once.
Hello, I'm planning to develop a communication platform fully in Ajax and long-polling.
There will be no full page reloading!
So the website address would always be www.domain.com
Do you recommend that for SEO?
Forget about SEO, what about your visitors - will people be able to bookmark a page on your site and get back to where they want to be? Will they be able to email a link to their friends to show them something?
It's not just Google that likes to have direct URLs to visit. Those direct URLs are vital for SEO, but they're also important for your human visitors too.
Google has a full specification on how to make ajax-powered sites like this crawlable.
The trick is to update window.location.hash with an escaped fragment whenever you want specific content to be linkable, and treated as its own page, without having to reload. For example, Twitter rewrites their URIs from http://twitter.com/user to http://twitter.com/#!/user.
From an SEO standpoint these are both valid and will be regarded as its own separate page. They can be directly linked to, and be used in browser history navigation. If you update your meta-data (keywords, description etc.) and sitemaps accordingly, SEO will be the least of your worries.
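Server-side, this works because the crawler converts the #! fragment into an _escaped_fragment_ query parameter and expects a plain HTML snapshot in return. A minimal PHP sketch, with placeholder content standing in for your real rendering:

<?php
// Google's Ajax crawling scheme: a crawler that sees www.domain.com/#!/user
// requests www.domain.com/?_escaped_fragment_=/user instead, and we answer
// with a plain HTML snapshot of that application state.

if (isset($_GET['_escaped_fragment_'])) {
    $state = $_GET['_escaped_fragment_']; // e.g. "/user"
    // Render the same content the Ajax view would show for this state.
    echo '<!DOCTYPE html><html><head><title>' . htmlspecialchars($state) . '</title></head>'
       . '<body>Static snapshot of the content for ' . htmlspecialchars($state) . '</body></html>';
    exit;
}

// Normal visitors get the single-page application shell instead
// (app.js is a hypothetical client script, not part of the question).
echo '<!DOCTYPE html><html><body><div id="app"></div><script src="app.js"></script></body></html>';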
As long as you can generate a fully qualified link for each page, you should be fine if you generate a sitemap including those links and submit it to Google.
If you look at Twitter and FB, they use #! in the URL so Google still crawls the pages.
If it's mostly using Ajax for content population, loading and state changes, then it's probably a bad model for SEO purposes anyway. Somewhat of a moot point by nature, no?
I have a classifieds website.
It has an index.html, which consists of a form. This form is the one users use to search for classifieds. The results of the search are displayed in an iframe in index.html, so the page won't reload or anything. However, the action of the form is a PHP page, which does the work of fetching the classifieds etc.
Very simple.
My problem is, that google hasn't indexed any of the search results yet.
Must the links be on the same page as index.html for Google to index the search results (because they are currently displayed in an iframe)?
Or is it because the content is dynamic?
I have a sitemap which works, with all the URLs to the classifieds in it, but they are still not indexed.
I also have this robots.txt:
Disallow: /bincgi/
The PHP code is inside the /bincgi/ folder - could this be the reason why it isn't being indexed?
I have used URL rewriting to rewrite the URLs of the classifieds to
/annons/classified_title_here
And that is how the sitemap is made up, using the rewritten URLs.
Any ideas why this isn't working?
Thanks
If you need more input let me know.
If the content is entirely dynamic and there is no other way to get to that content except by submitting the form, then Google is likely not indexing the results because of that. Like I mentioned in a comment elsewhere, Google did some experimental form submission on large sites in 2008, but I really have no idea if they expanded on that.
However, if you have a valid and accessible Google Sitemap, Google should index your classifieds fine. I suggest using the Google Webmaster Tools to find out how Google treats your site and to diagnose any potential problems with crawling.
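For completeness, a sitemap for the rewritten /annons/ URLs takes only a few lines of PHP. A minimal sketch, with the slugs stubbed in place of a database query and example.com standing in for your domain:

<?php
// sitemap.php - list every classified under its rewritten /annons/ URL so
// Google can find them without submitting the search form.

$slugs = ['begonia_for_sale', 'old_bike', 'red_sofa']; // stand-in for a DB query

header('Content-Type: application/xml; charset=utf-8');
echo '<?xml version="1.0" encoding="UTF-8"?>' . "\n";
echo '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">' . "\n";
foreach ($slugs as $slug) {
    printf("  <url><loc>http://www.example.com/annons/%s</loc></url>\n", htmlspecialchars($slug));
}
echo '</urlset>';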
Using eBay is probably a bad example, as it's not impossible that Google uses custom rules for such a popular site.
Although it is worth considering that eBay has text links to categories and subcategories of auction types, so it is possible to find auction items without actually filling in a form.
Personally, I'd get rid of the iframe; it's not unreasonable when submitting a form to load a new page.
That question is not answerable with the information given - too many detail questions are left open. It would help if you posted your site domain and the URLs that you want to get indexed.
Depending on how you use GWT, it can produce unindexable content.
Switch every parameter to GET,
make HTML links to those search queries on webpages already known by Googlebot (see the sketch below),
and they'll be indexed.
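A minimal sketch of that second step in PHP - plain links a crawler can follow; the category list and the /search path are assumptions, not the asker's actual setup:

<?php
// Render plain <a> links to GET-based search result pages so Googlebot can
// reach the listings without submitting a form. The category list is a
// stand-in for a database query; the /search path is hypothetical.

$categories = ['cars', 'furniture', 'electronics'];

foreach ($categories as $category) {
    $url = '/search?category=' . urlencode($category);
    printf('<a href="%s">%s</a><br>' . "\n", htmlspecialchars($url), htmlspecialchars($category));
}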