Canonical URLs for Search Engines

by Stephen Fluin 2010.01.18

One of the recent advancements in how search engines process HTML pages is the addition of the rel="canonical" link tag. This tag is added to the head section of the HTML, and it tells search engines the preferred URL for the page. This is useful because, with the advent of dynamic pages over the past 10 years, the same content is now often accessible through many different URLs with many different options. Without this additional piece of metadata, a search engine can't tell whether url?s=newest is the same page as url?s=oldest. This is an example where both URLs represent the same content. It also can't tell whether url?p=dynamic and url?p=history are the same page. This is an example where a single PHP URL could show different pages based on the GET variables.
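As a rough illustration of the distinction above, the following Python sketch derives one preferred URL from several variants. It assumes that "s" only changes presentation (such as sort order) while "p" selects the content; the PRESENTATION_ONLY set, the canonical_url function, and the example.com address are hypothetical names used for illustration only.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: "s" only affects presentation (e.g. sort order), not content.
PRESENTATION_ONLY = {"s"}


def canonical_url(url: str) -> str:
    """Return the URL with presentation-only query parameters removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in PRESENTATION_ONLY
    ]
    kept.sort()  # consistent ordering, so ?a=1&b=2 and ?b=2&a=1 agree
    # The fragment is dropped because it is never sent to the server.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


# Both sort orders collapse to one preferred URL.
print(canonical_url("http://example.com/url?s=newest"))
print(canonical_url("http://example.com/url?s=oldest"))
# Content-selecting parameters are kept, so these stay distinct.
print(canonical_url("http://example.com/url?p=dynamic"))
print(canonical_url("http://example.com/url?p=history"))
```

Which parameters count as presentation-only depends entirely on the site, so the set above would need to be chosen per application.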

The solution is to add the following line of code to the head of any page that can be reached in multiple ways, or that has multiple URLs for a single set of content, with the href set to the preferred URL.

<link rel="canonical" href="" />
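For pages generated dynamically, the tag can be produced by a small helper. This is a minimal Python sketch, not a definitive implementation: canonical_link_tag and the example.com URL are hypothetical, and the only assumption is that the preferred URL is already known.

```python
from html import escape


def canonical_link_tag(preferred_url: str) -> str:
    """Build the canonical link tag for the head section of a page."""
    # Escape &, <, > and quotes so the URL is safe inside an HTML attribute.
    return f'<link rel="canonical" href="{escape(preferred_url, quote=True)}" />'


print(canonical_link_tag("http://example.com/url?p=history&lang=en"))
```

Escaping matters because query strings commonly contain an ampersand, which should appear as &amp;amp; inside an attribute value.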

Google was among the first search engines to support this tag, and because it simplifies processing for search engines, it is likely to be supported by all major search engines. This technique can even be used cross-domain if you have the same content hosted and served across multiple domains.
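For the cross-domain case, one possible approach is to point every copy of a page at the same path on a single preferred host. The Python sketch below assumes the paths and query strings are identical across domains; PREFERRED_HOST, cross_domain_canonical, and the example domains are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption: this is the one domain that should be indexed.
PREFERRED_HOST = "www.example.com"


def cross_domain_canonical(url: str) -> str:
    """Map a URL on any mirror domain to the same path on the preferred host."""
    parts = urlsplit(url)
    # Keep scheme, path, and query; swap only the host and drop the fragment.
    return urlunsplit((parts.scheme, PREFERRED_HOST, parts.path, parts.query, ""))


print(cross_domain_canonical("http://mirror.example.net/article?p=history"))
```

The result would then be used as the href of the canonical link tag on each mirrored page.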