Is Your Site a Victim of URL Canonicalization?

Photo by Mark Knol

by Nicolas Prudhon · 67 comments

In SEO, there are often things you miss that may ultimately have a negative effect on your site. Nicolas Prudhon goes over that “one thing” you may have overlooked while working on your site’s SEO.

You have been working hard to optimize your site, and yet you may have forgotten one thing. If you are at all conscientious about the SEO of your site, you probably have a good keyword, a good page title, backlinks and so on.

Did you know that because of the one thing you forgot to do, part of that hard work is going to waste?

Take a few seconds to look at the following URLs:

  • http://www.mydomain.com
  • http://mydomain.com
  • http://www.mydomain.com/index.html
  • http://www.mydomain.com/index.php
  • http://mydomain.com/index.html
  • http://mydomain.com/index.php

If your site is like most websites out there, it is very likely that those 6 URLs actually return the same page.

If you analyze the page behind each of those URLs, you may find that each one has a different set of backlinks and a different PageRank (PR).

This is due to URL canonicalization

Indeed, to you and your visitors, all those links may resolve to the same page, but to the search engines, those are 6 different pages. It is then left to the search engine to decide which one it shows in its results.
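To make the idea concrete, here is a minimal Python sketch that collapses the six variants listed earlier into one preferred form. The helper `canonicalize` and its rules (force the www host, drop a trailing index.html or index.php) are assumptions chosen to mirror this article's example; they do not describe how any search engine actually works.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical rule set mirroring this article's example only.
INDEX_FILES = ("index.html", "index.php")


def canonicalize(url: str) -> str:
    """Map a URL variant to one preferred form: www host, no index file."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if not host.startswith("www."):
        host = "www." + host
    path = parts.path or "/"
    for name in INDEX_FILES:
        if path.endswith("/" + name):
            path = path[: -len(name)]  # keep the trailing slash
            break
    # Fragments are dropped; the query string is kept unchanged.
    return urlunsplit((parts.scheme, host, path, parts.query, ""))


variants = [
    "http://www.mydomain.com",
    "http://mydomain.com",
    "http://www.mydomain.com/index.html",
    "http://www.mydomain.com/index.php",
    "http://mydomain.com/index.html",
    "http://mydomain.com/index.php",
]

# All six variants collapse to a single canonical URL.
unique = {canonicalize(u) for u in variants}
print(unique)
```

A search engine has no such rule set for your site unless you give it one, which is why each variant can end up being treated as its own page.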

In most cases, the URL returned will be the one with the most inbound and outbound links, so the process usually goes unnoticed by the webmaster.

The main problem with this, when it comes to SEO, is simply that part of your effort leaks across those different URLs.

To remedy this problem, you must tell the search engines that those URLs are in fact one and the same.

By doing so, you concentrate your SEO efforts on a single URL.

On an Apache server with mod_rewrite enabled, this is done quite simply by adding a few directives to your .htaccess file, as follows:

  • To redirect http://mydomain.com to http://www.mydomain.com:
    RewriteEngine on
    # Match only the bare host (no www), case-insensitively
    RewriteCond %{HTTP_HOST} ^mydomain\.com$ [NC]
    RewriteRule ^(.*)$ http://www.mydomain.com/$1 [R=301,L]
    
  • To redirect http://www.mydomain.com/index.html to http://www.mydomain.com:
    RewriteEngine on
    # Only act when the client itself asked for index.html (avoids a redirect loop)
    RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /.*index\.html\ HTTP/
    RewriteRule ^(.*)index\.html$ /$1 [R=301,L]
    
  • To redirect http://www.mydomain.com/index.php to http://www.mydomain.com:
    RewriteEngine on
    # Only act when the client itself asked for index.php (avoids a redirect loop)
    RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\ /.*index\.php\ HTTP/
    RewriteRule ^(.*)index\.php$ /$1 [R=301,L]
    

Note: don’t forget to replace “mydomain” with your actual domain name.
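The redirect rules listed above are meant to express two patterns: a host check for the bare domain, and a path check for a trailing index file. The Python sketch below imitates one pass of those rules so you can see which target each request would get. It is a simplification under stated assumptions: `simulate` is a hypothetical helper, Python's `re` module stands in for Apache's regex engine, the two index rules are merged into one pattern, and the THE_REQUEST condition is not modelled.

```python
import re

# Assumed patterns: the bare host, and a path ending in index.html or index.php.
HOST_COND = re.compile(r"^mydomain\.com$", re.IGNORECASE)
INDEX_RULE = re.compile(r"^(.*)index\.(?:html|php)$")


def simulate(host: str, path: str):
    """Return the redirect target of the first rule that fires, or None.

    `path` is written the way .htaccess rules see it: without the leading slash.
    """
    if HOST_COND.match(host):
        # Rule 1: bare host -> www host, path preserved.
        return f"http://www.mydomain.com/{path}"
    m = INDEX_RULE.match(path)
    if m:
        # Rules 2 and 3: strip the index file, keep the directory part.
        return f"http://{host}/{m.group(1)}"
    return None


for host, path in [
    ("mydomain.com", ""),
    ("www.mydomain.com", "index.html"),
    ("www.mydomain.com", "blog/index.php"),
    ("www.mydomain.com", ""),
]:
    print(host, repr(path), "->", simulate(host, path))
```

Note that a request for http://mydomain.com/index.html takes two hops in this model: first to the www host, then to the URL without the index file.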

What This Code Does

By adding those directives to your .htaccess file, you are telling the search engines that, regardless of which of those URLs is requested, they should end up at http://www.mydomain.com

The method used is called a 301 (permanent) redirect. It is an extremely powerful SEO tool, as any page moved or redirected by this means doesn’t lose its value.

It may sound a bit complicated and technical, but as far as you are concerned, it’s only a matter of copy and paste (and replacing “mydomain” with your actual domain name).
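A rough way to confirm that the redirects are in effect is to request each variant without following redirects and look at the status code and the Location header. The sketch below uses only Python's standard library; `first_response`, `is_permanent_redirect_to` and `check` are hypothetical helper names, and the URLs in the usage comment are this article's placeholders.

```python
import http.client
from urllib.parse import urlsplit


def first_response(url: str, timeout: float = 10.0):
    """Send one HEAD request (http.client never follows redirects).

    Returns a (status, location) tuple; location is None if the header is absent.
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        conn.request("HEAD", parts.path or "/")
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


def is_permanent_redirect_to(status, location, canonical: str) -> bool:
    """True if the response is a 301 whose Location is the canonical URL."""
    if status != 301 or location is None:
        return False
    # Ignore a trailing slash when comparing the two URLs.
    return location.rstrip("/") == canonical.rstrip("/")


def check(variants, canonical):
    """Print one line per variant showing where its first response points."""
    for url in variants:
        status, location = first_response(url)
        ok = is_permanent_redirect_to(status, location, canonical)
        print(f"{url}: {status} -> {location} ({'ok' if ok else 'check your rules'})")


# Example usage (performs real network requests, so it is left commented out):
# check(["http://mydomain.com", "http://www.mydomain.com/index.html"],
#       "http://www.mydomain.com/")
```

A variant that needs two hops, such as the bare domain plus index.html, will show the intermediate URL here rather than the final one, so it is flagged for a closer look even when the chain ends in the right place.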

Congratulations, you can now enjoy the full benefits of your SEO work!

Article by Nicolas Prudhon

Nicolas Prudhon is an Internet Marketing & SEO strategist, as well as a published author. Through SEO Help by Nicolas Prudhon, he's dedicated to sharing all his knowledge and experience. Join his latest free SEO training course "21 Days SEO Mastery" now!

  • Summary

    URL Canonicalization is a very important, but very neglected step in SEO. Learn what URL Canonicalization is, and multiple ways to implement it on your website.

  • Key Points

    • Files such as index.html, index.php, etc. return the same page, but search engines treat each URL as a different page.
    • URL Canonicalization can be achieved with some simple .htaccess edits (AKA, 301 Redirects).

Alex March 23, 2009 at 2:57 pm

Thanks for the great post Nick. I’ve never heard of URL Canonicalization, but this is something that I’m now going to apply to this blog, and any future blogs. I’m no SEO genius, so this really helps.

Thanks a lot, and I hope to see some more posts from you soon! :)

Nicolas Prudhon March 23, 2009 at 8:20 pm

Actually Alex, despite the fact that this article is much more “technical” than what I usually write, it is exactly because a lot of people have never heard about it that I thought it might come in handy ;)

Anyway, I’m glad to be of some help and thank you for publishing my article so fast!

Nicolas Prudhon’s last blog post..SEO help with Niche Marketing

Stuart Conover March 23, 2009 at 4:34 pm

The major search engines actually added a tag recently that allows you to tell them what url you want for canonicalization. Within 2 days of the announcement there were 3 wordpress plugins that were out to auto create the tag on pages for you. Just another route to look into.

I really hate to link to one of my own sites in a comment, but I did a QUICK writeup here: http://www.stuartconover.com/2009/02/19/canonical-links-plug-ins/ with links to 2 WordPress plugins and 1 Joomla plugin. Figured they might prove useful after today’s post ;)

Alex March 23, 2009 at 8:19 pm

That’s pretty interesting. Thanks for sharing that Stuart!

Nicolas Prudhon March 23, 2009 at 8:24 pm

Hi Stuart, I believe that it is not the “link” itself that gives a spam feel, but rather if it is related and helpful to the discussion or not.

As Alex mentioned earlier, people have very little information about it, so whatever can be useful and helpful for their understanding is more than welcome!

Honestly, I wasn’t even aware of those plugins myself, so I have to thank you too for sharing these resources.

That’s really something I love about these blog discussions: we can all share and learn so much!

Nicolas Prudhon’s last blog post..Why my site is not indexed in Google?

Dennis Edell March 24, 2009 at 11:25 am

Will installing the plugins have the exact same effect? I’d rather not mess with the .htaccess if I don’t have to.

Btw, my commentluv shows another Nicolas guest post. ;)

Dennis Edell’s last blog post..3 Secrets to Writing for the Search Engines

Nicolas Prudhon March 24, 2009 at 8:45 pm

Hi Dennis, from what I read about these plugins, although they may look like they do the same thing, I think I found a major difference there.

“This simple plugin helps you easily tell the search engine bots the preferred version of a page by specifying the canonical properly within your head tag.”

The code used in the .htaccess file uses the 301 redirect, which not only tells the search engines your exact canonical setting, but ALSO maintains and unifies all your SEO stats (ranking, PR, etc.)

The code in your .htaccess file is merely 3 small lines, against having some extra meta information in the header of each and every page of your site.

It’s very easy to see whether you have done things right. After updating your .htaccess file, just try to access your site without the www; if the page that opens automatically is the one with www, you have done things correctly.

Nicolas Prudhon’s last blog post..Discerning mind against Information Overload

Dennis Edell March 25, 2009 at 1:48 pm

Thanks for that, I’ll give it a go.

Dennis Edell’s last blog post..3 Secrets to Writing for the Search Engines

David Lemcoe March 23, 2009 at 9:15 pm

Very well done. .htaccess and httpd.conf files make all the difference in SEO in most cases.

Great post,

Big D

Nicolas Prudhon March 23, 2009 at 9:21 pm

Thank you David,

I really appreciate the comment.

Nicolas Prudhon’s last blog post..Internet Marketing Starts with Personal Relationships

Dean Saliba March 24, 2009 at 9:21 am

I have also never heard of URL Canonicalization before.

Something I think a lot of us should be looking into.

Dean Saliba’s last blog post..Blogitive – Nice To Add To Your Collection

Nicolas Prudhon March 24, 2009 at 8:00 pm

Thanks for the comment Dean!

It’s not really surprising actually. As humans, we don’t really ask ourselves this kind of question. Common sense tells us that it’s the same page that is returned… but search engines don’t have common sense.

Nicolas Prudhon’s last blog post..Discerning mind against Information Overload

Alex March 24, 2009 at 8:39 pm

I have always realized that index.html, index.php, etc. are all different pages, but never really factored in the SEO of it all. Yet, I knew that using WWW and Non WWW is a no-no, and you should redirect them just like with the index pages. Seriously disappointed with myself for not seeing that at first, haha.

Nicolas Prudhon March 24, 2009 at 8:50 pm

Just a quick note about that.

Although my example shows you how to redirect a non www site towards a www site, the opposite can be done too.

From an SEO point of view it doesn’t matter which version you choose, with www or without www. What matters is for you to be consistent with your choice.

Nicolas Prudhon’s last blog post..Discerning mind against Information Overload

Miami web design March 24, 2009 at 9:51 am

Yes, valid point raised. Google sees links with www and without www as different pages, hence your link juice is split between two pages. So the first thing we do during on-site optimization is redirect the non-www URL to the www URL.

Nicolas Prudhon March 24, 2009 at 8:06 pm

You’re absolutely right. Actually, most people only think of redirecting their non-www URL to their www URL, but it’s good to include the index page as well. When we think about it, it’s the same issue, and it doesn’t take much more work.

Nicolas Prudhon’s last blog post..WVO is taking over SEO Part 3

Kai Lo March 26, 2009 at 10:56 pm

I’m glad I started blogging with a purchased domain url. If I didn’t, I would have a lot of backlinks to a domain.blogspot.com link instead.

Kai Lo’s last blog post..Web Domain Value

Nicolas Prudhon March 26, 2009 at 11:38 pm

Considering how cheap domain names are now, I would definitely recommend people get one as soon as possible. It’s really sad to see your hard work going to waste just because you didn’t plan properly at first.

Nicolas Prudhon’s last blog post..WordPress Tutorial

Evan March 30, 2009 at 5:10 pm

Great informational post, never knew about this sort of thing.

Had to rattle my brain a little for this one! ;)

Nicolas Prudhon March 30, 2009 at 6:24 pm

Hi Evan, sorry for this post to be a bit technical, however without going into the details in the meaning of the code itself, the concept is not that hard to understand.

I’m glad that I have been able to help you there.

Nicolas Prudhon’s last blog post..The 5 Best Friends of the full time Internet Marketer

Davey March 31, 2009 at 1:39 am

So that means that www or non www, it doesn’t matter? I mean I can change the domain name of my blog by going to WordPress settings and changing the blog URL. But what do you suggest? Getting a www attached to my blog name, or just keeping it as is and maintaining the uniformity? Does it have something to do with the SEO?

Nicolas Prudhon March 31, 2009 at 5:24 am

Hi Davey, that’s correct, www or non www is the same, it’s just a matter of personal preference only. Whatever form you use, make sure to keep using it everywhere, if you use the non www make sure not to include www in your inbound links.

And yes, it has to do with your SEO and where you distribute your “link power”; the tip shown in this article is there to help you channel all those leaks into one uniform URL.

Andrew April 23, 2009 at 4:34 am

I couldn’t get it to work. The www redirect did nothing and the other 2 gave error 500. My domain is running on a RedHat host. I changed the mydomain to the correct one.

Nicolas Prudhon April 23, 2009 at 8:50 pm

Hi Andrew,

Quite surprising but anyway, try the following code:
===========================================
Options +FollowSymlinks
RewriteEngine on
rewritecond %{http_host} ^domain.com [nc]
rewriterule ^(.*)$ http://www.domain.com/$1 [r=301,nc]
===========================================
Note: don’t forget to change “domain” by your actual domain name!

Let me know how it goes.

Nicolas Prudhon’s last blog post..Targeting your Market with Google Trends

Andrew April 25, 2009 at 11:45 am

Thanks Nicolas!
That worked.

Nicolas Prudhon April 25, 2009 at 12:01 pm

Awesome Andrew!

If you have any other questions, don’t hesitate to ask, I’m here to help!

Nicolas Prudhon’s last blog post..Can We Survive the “NoFollow” Black hole?

Mike Dalton November 17, 2009 at 1:45 pm

Thanks Nicholas – Great resource
I’m already using the redirect to the www – wished I’d run across this a year ago!
However, I am getting 500 errors trying to use the code to get the .com instead of the .com/index.htm (I did replace the .html in your code with .htm).
I’m using Godaddy running Linux if that makes a difference.

Thanks much!

Manual Web Directory May 10, 2009 at 12:51 pm

Finally someone who can write a good blog ! . This is the kind of information that is useful to those want to increase their SERP’s. I loved your post and will be telling others about it. Subscribing to your RSS feed now. Thanks

Nicolas Prudhon May 10, 2009 at 7:29 pm

Yes, Alex and Janith are doing an excellent job with this blog!

Nicolas Prudhon’s last blog post..Optimizing for Keywords that Get Clicked On

Make Money Online May 28, 2009 at 11:48 am

I’ve tried it 2 days ago, I want to redirect my blog URL with W W W to without W W W. After I’d done it, I can’t access my blog anymore. So now I remain the same coding without changing anything.
Btw, with these way, can we also redirect single post as well?

Regards,
Lee

Nicolas Prudhon May 28, 2009 at 6:40 pm

Hi Lee, I replied to you on my blog.

Nicolas Prudhon’s last blog post..SEO Quiz – Questions and Answers

Miami Web Design August 3, 2009 at 4:16 am

Thanks. I’ll update my .htaccess files right away

ZQ | Travel Blog August 8, 2009 at 5:08 am

Great post – learned something new today – will edit my htaccess files now (now i realised how impt it is to the security of my blog)

ZQ
ZQ | Travel Blog´s last blog ..Saturday Singapore Issue 3: Singapore Flyer

Nicolas Prudhon August 9, 2009 at 12:39 am

Actually the primary use is for SEO, but it does indeed potentially also solve some security issues. ;)

Adam Baird August 19, 2009 at 5:04 pm

Doesn’t Thesis automatically take care of this for you?
Adam Baird´s last blog ..A Lesson From Moving: Consistency!

Seth August 19, 2009 at 5:20 pm

Yes it does! Another great reason to get Thesis Theme.

web Gift September 12, 2009 at 9:02 pm

thanks for sharing this info :)

Volksphone September 19, 2009 at 9:33 am

Thanks for this mod_rewrite tutorial. I have always problems to set it up correctly.

New Cars Guru September 22, 2009 at 2:14 am

The best way for me is to choose http://www.domain.com over domain.com. It seems more natural to me this way. And if you have set up one way or another, please remember NOT TO SWITCH later, when you have brought your blog to some level of development, as it may lower your PR rank, and SERP results…
New Cars Guru´s last blog ..New Porsche 911 GT3 RS

Prisqua October 22, 2009 at 9:00 am

Thanks for the info, but when I tried to add that code to my .htaccess file I got a “500 Internal Error”. I saw in another comment that you give different code, but is it meant to replace the code you mention in the article? I am just a bit confused…
Prisqua´s last blog ..Twitter Weekly Updates for 2009-10-18

Nicolas Prudhon November 10, 2009 at 1:41 am

Hi Prisqua,

Sorry for the late reply.

Yes, the other code I gave is to be used instead of the previous one. Depending on your server configuration and .htaccess file, you may have experienced some difficulties with the first code, but you should be fine with the other one I provided.
Nicolas Prudhon´s last blog ..Staying Ahead Of Your Competition In 5 Steps

Mike April 25, 2010 at 8:11 am

This is main part in SEO mistakes as i have seen people building links with main url and internal linking with index.html or home.php etc
Mike´s last blog ..Using Podcasts To Increase Traffic Flow To Your Website

Danka April 26, 2010 at 1:02 pm

Hi Nicolas
If I run my blog on the Thesis theme, should I just forget about updating .htaccess altogether? Thanks

Prasen Dutta May 6, 2010 at 8:02 am

Cool posting here. please keep in continue.

Regards~
Prasen Dutta
Business Development LLC, at software development india

Miss WBS May 21, 2010 at 7:18 pm

Great tutorial, i didn’t know about this, i will change it tomorrow, need to create a htfile first because i don’t have it in my files…

Cruise Forums May 23, 2010 at 10:44 pm

This just happened to me 3 days ago: 6 keywords of my site were at the top of Google, and suddenly, within four hours, all of them disappeared as if they had never been on Google. The first thing that came to my mind was that I had been penalized, but that was not the case. I worked on it and at last solved the problem, but if I had found this resource I would have solved it easily. Really embarrassing issue for me. Canonicalization?

VirtueMart Templates May 26, 2010 at 10:02 am

Yeah, the problem is that you almost always judge others by applying your own standards – what I mean is that there are some things like this URL canonization which you seem to have known for years, and consequently you by default think that almost all other webmasters already know that – it’s just that simple. But then you find out that hundreds or thousands of webmasters repeat one and the same mistake – and you just keep being surprised at how that is possible.

Car Transportation May 29, 2010 at 3:17 pm

I think All In One SEO takes care of this for Wordpress.
Car Transportation´s last blog ..Video: What the recalled Lexus LS steering wheel defect looks like

Laura June 6, 2010 at 8:38 pm

Hi there, great info, I just can’t seem to find the .htaccess file, I’m using the thesis theme.

thanks!

Laura

Laura June 7, 2010 at 9:47 pm

Hi, the code is not working for me either. What else can I try to do to make this work? I have thesis theme, so do I still need to add this code?

Laura June 7, 2010 at 9:58 pm

Hey there again, ok, so I put this code in:

Options +FollowSymlinks
RewriteEngine on
rewritecond %{http_host} ^positiveattitudequotes.com [nc]
rewriterule ^(.*)$ http://www.positiveattitudequotes.com/$1 [r=301,nc]

RewriteEngine on
RewriteCond %(THE_REQUEST) ^[A-Z] {3,9}\ /.*index\.html\ HTTP/
RewriteRule (.*) index\.html$ /$1 [R=301,L]

RewriteEngine on
RewriteCond %(THE_REQUEST) ^[A-Z] {3,9}\ /.*index\.php\ HTTP/
RewriteRule (.*) index\.php$ /$1 [R=301,L]

And as soon as I do that, it gives me the 501 error redirect message.
Where exactly in the document should this code be placed? I thought maybe it might have something to do with that, so I tried it in a few different places, but as soon as I save it, it goes to an error message….ARG!! What else can I do? Because I can see that mydomain.com/index.html and .php are there and when I type it in, it goes to that page and retains the .index.html…so I’m assuming that right now I’m experiencing URL canonization, right? How do I fix this?

Thanks,
Laura

Peter Adrien - vendre auto June 25, 2010 at 6:36 am

Hi Nicolas, important topic to blog about. The code for 301 redirects differs depending on the server.

I would like to share the codes for two different servers here.

1. For Apache server (non www to www)

RewriteEngine on
RewriteCond %{HTTP_HOST} ^(domain\.com)(:80)? [NC]
RewriteRule ^(.*) http://www.domain.com/$1 [R=301,L]

We should place these codes in .htaccess file.

2. For Microsoft ISAPI server, (non www to www)

RewriteCond Host: ^domain\.com
RewriteRule (.*) http\://www\.domain\.com$1 [I,RP]

File redirections

RewriteCond Host: www\.domainname\.com
RewriteRule /index\.html / [I,RP]

But it requires Microsoft ISAPI tool to carry on this process.

Thank you.

Beauchamp Web Design July 12, 2010 at 3:28 pm

Great tips and advice thanks for the post
Beauchamp Web Design´s last blog ..Local Firm Directory Web Design

Phil July 28, 2010 at 1:33 am

He dear

Thanks for this mod_rewrite tutorial. I have always problems to set it up correctly.

Phil July 28, 2010 at 11:24 pm

I think All In One SEO takes care of this for Word press. Thanks for sharing this information.

Ian August 4, 2010 at 12:14 pm

Very informative. I’ve known for a while that 301 redirects are essential for websites, I just never knew exactly how to code them. Thanks a lot for that. The rel="canonical" tag is also very effective for this purpose. It also requires a little less technical knowledge and is endorsed by Google. This is especially useful for guest posts, as it lets the engines know where the content originated from. Hopefully sometime in the near future scrapers will begin utilizing the canonical tag :) .
Ian´s last blog ..home

nick August 5, 2010 at 9:34 am

i read this blog and i think this is nice blog….great info, I just can’t seem to find the .htaccess file, I’m using the thesis theme…Actually the primary use is for SEO, but it does indeed potentially also solve some security issues as well as valid point raised, Google see links with www and without www as different pages hence your link juice is split between two pages. So first thing while doing on-site optimization we do is to redirect non-www url to www url…

thanks for sharing information…

Kimi August 8, 2010 at 1:51 am

This has been confusing stuff for newbies, including me. Does duplicate content actually still exist?

Also, I had a bad experience with changing the address a few months back, because it took a whole lot of time for Google to re-index my blog.

Thanks for this info. I have checked mine and, thank god, it redirects to one address :)
Kimi´s last blog ..Best free web host for wordpress

MTB Brakes August 9, 2010 at 6:00 pm

This canonicalization is a tough nut to get your head around. I mean, I have my site redirected with .htaccess to the non-www version, which was fine, but what about that trailing slash? Does that count as a separate page in Google’s eyes? Even worse, I’ve seen addresses appearing with a hash also. Eg: http://example.com/# – yet it points to the home! I’m lost to be honest.
MTB Brakes´s last blog ..Its nearly august…

sam August 26, 2010 at 9:20 am

Great tutorial, i didn’t know about this, i will change it tomorrow, need to create a ht file first because i don’t have it in my files. Great post – learned something new today.

seo specialists September 1, 2010 at 2:46 am

This is the most comprehensive guide I have come across. Thanks for sharing this with us! There are many things even experienced bloggers can learn from this post.
