Implementing support for a canonical page is easy, at least in technical terms. In many cases, it may be more difficult to determine where duplicates may be an issue within your site and which page to specify as the URL to index. Read on for help with both.
The first step in providing a canonical tag for the search engines is to identify where you may have duplicate content within your website. This isn’t always as simple as it sounds. Let’s recap a few ways that duplicate content might be a problem and what the canonical URL for those pages should be:
Gateway pages: Many sites have multiple entry pages. There are a number of valid reasons for configuring a site this way, one of which deals with tracking scripts. It’s easier to know which search engine or linking page a visitor comes from if you list a separate welcome page for each search engine or link. An example might look like “http://mysite.com/yahoo_index.php”. In this case, the canonical page will probably be “index.php”, “index.html” or “default.htm”, etc. – the entry page someone would see if they typed your site address directly into the browser.
Printer-friendly versions: There’s often a need to supply visitors with separate versions of pages, formatted to print nicely. In this case, you’re definitely duplicating the content that the search engine spiders will index. You’ll want the search engines to list the pages you’ve designed for web viewing, so your printer-friendly versions should contain references to the web versions in the canonical tags.
Pages with dynamic content: Dynamic content can create duplication for the search engines that isn’t always obvious to the developer. An example might be a dynamically created menu that builds links to a product with different color choices or sizes. These links may all lead to the same product page, with only a query string as the difference, such as: “http://mysite.com/ballcaps/1234?color=black”. This means the search engine spider can follow each of the menu links and index the same product page under several URLs. Pages that might be called with a query string should contain canonical tags pointing to their own base URL, without the query string (here, “http://mysite.com/ballcaps/1234”).
There are certainly more cases than those listed above, but the examples will hopefully get you started off in the right direction. Remember that the less duplicate content the spiders find, the better your chances for high page ranking.
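For the dynamic-content case above, the canonical URL can often be derived by dropping the query string. The following is a minimal Python sketch of that idea, not a complete solution; the function name strip_query is hypothetical, and it assumes the query string never changes what the page displays.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_query(url: str) -> str:
    """Return the URL without its query string or fragment.

    Assumption: the query string only selects options such as color
    or size, so every variant shows the same product page.
    """
    parts = urlsplit(url)
    # Keep scheme, host and path; blank out the query and fragment.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


if __name__ == "__main__":
    print(strip_query("http://mysite.com/ballcaps/1234?color=black"))
```

If some query parameters do change the content (a page number, for instance), they would need to be kept rather than stripped.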
So, now that we know where, let’s get to the “how”. A canonical URL is specified with the <link> tag. It belongs in the <head> section of your pages, and the structure looks like this:
<link rel="canonical" href="http://www.mysite.com/index.htm" />
The href value should be the page that best represents the duplicate pages, in other words, the page you want the search engines to index. This value can be an absolute reference, such as “http://www.mysite.com/index.htm” or a relative reference, like “index.htm”.
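If your pages are generated by a script, the tag can be built in code. The sketch below is one possible way to do it in Python; the function name canonical_link_tag is hypothetical, and it resolves a relative reference against the current page's address so the tag always carries an absolute URL.

```python
from html import escape
from urllib.parse import urljoin


def canonical_link_tag(page_url: str, href: str) -> str:
    """Build a canonical <link> tag for the page at page_url.

    href may be absolute or relative; a relative value is resolved
    against page_url, the same way a browser would resolve it.
    """
    absolute = urljoin(page_url, href)
    # Escape the URL so characters such as & are safe inside the attribute.
    return f'<link rel="canonical" href="{escape(absolute, quote=True)}" />'


if __name__ == "__main__":
    print(canonical_link_tag("http://www.mysite.com/yahoo_index.php", "index.htm"))
```

The output of this helper would be placed in the head section of the gateway page.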
One other point to note is that the URL in the tag must be located within the same domain as the current page. Pages within subdirectories (mysite.com/products) or subdomains (products.mysite.com) will work, but if you need to reference a page outside the domain, you’ll need to learn to use the 301 redirect directive.
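A quick check along these lines can flag canonical URLs that leave the domain. This Python sketch is deliberately naive: the names site_of and same_site are hypothetical, and it assumes the site's domain is simply the last two labels of the host name, which is wrong for suffixes such as “.co.uk”.

```python
from urllib.parse import urlsplit


def site_of(url: str) -> str:
    """Return the last two labels of the host name, e.g. 'mysite.com'.

    Naive assumption: the registered domain is always two labels long.
    """
    host = urlsplit(url).hostname or ""
    return ".".join(host.split(".")[-2:])


def same_site(page_url: str, canonical_url: str) -> bool:
    """True when both URLs share the same (naively computed) domain."""
    return site_of(page_url) == site_of(canonical_url)


if __name__ == "__main__":
    # A subdomain pointing at the main site stays within the domain.
    print(same_site("http://products.mysite.com/caps", "http://www.mysite.com/index.htm"))
    # A different domain would call for a 301 redirect instead.
    print(same_site("http://mysite.com/", "http://othersite.com/"))
```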
Implementing this new hint for the search engines is just that simple. Consider that it may help improve your site’s search engine positioning and we think you’ll agree it’s probably worth the time and effort.